From: Steve Ofsthun <steve.ofsthun@oracle.com>
To: Mukesh Rathor <mukesh.rathor@oracle.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>,
"Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>,
Hackel <kurt.hackel@oracle.com>,
jeremy@goop.org, Keir Fraser <keir.fraser@eu.citrix.com>,
Kurt@acsinet11.oracle.com
Subject: Re: [timer/ticks related] dom0 hang during boot on large 1TB system
Date: Mon, 21 Dec 2009 14:17:57 -0500
Message-ID: <4B2FC9E5.4050001@oracle.com>
In-Reply-To: <20091218204318.180e58f3@mantra.us.oracle.com>
Mukesh Rathor wrote:
> On Fri, 18 Dec 2009 07:02:55 +0000
> Keir Fraser <keir.fraser@eu.citrix.com> wrote:
>
>> On 18/12/2009 04:36, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>>
>>> The other fix I thought of was to change INITIAL_JIFFIES to
>>> something sooner.
>>>
>>> Would appreciate any help, I don't understand xen time management
>>> well.
>> This isn't really Xen time code, but unchanged Linux time code. I
>> don't know which tree you quoted the code from -- 2.6.18 has similar
>> but not identical. Anyway, I suggest try using the jiffy-comparison
>> macros from <linux/jiffies.h>: time_before(), time_after(), etc.
>> These are designed to work even when jiffies wraps. Feel free to send
>> patch(es) for that, if you test that out and it works okay.
>>
>> -- Keir
>>
>
> Ok, I came up with the following patch. Jeremy, can you please take a
> look also, and comment on my fix since I noticed you've got the same
> issue in your tree. Here's a summary for your benefit:
>
> init/calibrate.c : calibrate_delay_direct():
>
>     start_jiffies = get_jiffies_64();
>     while (get_jiffies_64() <= (start_jiffies + tick_divider)) {
>         pre_start = start;
>         read_current_timer(&start);
>     }
>
Linux time code deliberately initializes jiffies (32-bit) so that it wraps soon after boot, which exposes any kernel code that makes bad assumptions about jiffies wrap. In your case, I'm guessing that the scrubbing delay causes enough timer interrupts to be delayed (queued up) that jiffies wraps earlier in the boot path than expected.
As Keir suggests, the correct solution is probably to use the time_before/after macros appropriately.
The proposed code avoids the problem by accessing jiffies_64 instead.
> If the first-ever timer interrupt comes after start_jiffies is set, dom0 boot
> may hang if delta in timer_interrupt() is so huge that it causes jiffies
> to wrap. It appears delta is very large when memory is more than 512GB on
> certain boxes, causing wraparound.
>
> Why is delta in dom0->timer_interrupt() related to memory on the system?
> Because the hypervisor creates dom0, then scrubs pages, then unpauses the
> vcpu. So it appears that a lot of page scrubbing results in a huge delta on
> the first tick.
The problem here may be that timers are running in the domain while the vcpu is not.
Steve