All of lore.kernel.org
 help / color / mirror / Atom feed
* Query about cpuidle
@ 2011-09-09 12:18 Andrew Cooper
  2011-09-09 12:50 ` Keir Fraser
  2011-09-13  3:19 ` Tian, Kevin
  0 siblings, 2 replies; 4+ messages in thread
From: Andrew Cooper @ 2011-09-09 12:18 UTC (permalink / raw)
  To: kevin.tian; +Cc: xen-devel@lists.xensource.com

Hello,

We have recently had a support escalation about Xen-4.1.1 being unable
to boot on HP BL460c G7 blades.  The problem turned out to be a null
function pointer deference (ns_to_tick in cpu_idle.c) during early boot
of dom0, in the set_cx_pminfo function.

I applied your patch, changeset 23662:2faba14bac13, about initializing
default C state information, and this appears to have fixed the problem.

However, I see in the patch that setting up the function pointers
(ns_to_tick, tick_to_ns etc) is predicated on the hypercall coming in on
CPU0.  What guarantees are in place to ensure that these function
pointers get set up?  I cant see anything obvious from the code, but
have to admit that the null pointer deference appears to have gone away.

Thanks in advance,

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Query about cpuidle
  2011-09-09 12:18 Query about cpuidle Andrew Cooper
@ 2011-09-09 12:50 ` Keir Fraser
  2011-09-13  3:18   ` Tian, Kevin
  2011-09-13  3:19 ` Tian, Kevin
  1 sibling, 1 reply; 4+ messages in thread
From: Keir Fraser @ 2011-09-09 12:50 UTC (permalink / raw)
  To: Andrew Cooper, kevin.tian; +Cc: xen-devel@lists.xensource.com

On 09/09/2011 13:18, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:

> Hello,
> 
> We have recently had a support escalation about Xen-4.1.1 being unable
> to boot on HP BL460c G7 blades.  The problem turned out to be a null
> function pointer deference (ns_to_tick in cpu_idle.c) during early boot
> of dom0, in the set_cx_pminfo function.
> 
> I applied your patch, changeset 23662:2faba14bac13, about initializing
> default C state information, and this appears to have fixed the problem.
> 
> However, I see in the patch that setting up the function pointers
> (ns_to_tick, tick_to_ns etc) is predicated on the hypercall coming in on
> CPU0.

Firstly, it's predicated on the hypercall addressing CPU0, rather than being
executed on CPU0. Secondly, the cpuidle_init_cpu() functiomn is *also*
called from the CPU-hotplug path in Xen, and is called directly from the
presmp_initcall path for CPU0. I don't know why it is called both on a
hypercall path and on a hotplug path, it seems weird. But anyhow, this means
that the function pointers will guaranteed get set up early during Xen boot?

I can only sympathise and agree that the code is complicated and opaque,
however.

 -- Keir

>  What guarantees are in place to ensure that these function
> pointers get set up?  I cant see anything obvious from the code, but
> have to admit that the null pointer deference appears to have gone away.
> 
> Thanks in advance,

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: Query about cpuidle
  2011-09-09 12:50 ` Keir Fraser
@ 2011-09-13  3:18   ` Tian, Kevin
  0 siblings, 0 replies; 4+ messages in thread
From: Tian, Kevin @ 2011-09-13  3:18 UTC (permalink / raw)
  To: Keir Fraser, Andrew Cooper; +Cc: xen-devel@lists.xensource.com

> From: Keir Fraser [mailto:keir.xen@gmail.com]
> Sent: Friday, September 09, 2011 8:50 PM
> 
> On 09/09/2011 13:18, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> 
> > Hello,
> >
> > We have recently had a support escalation about Xen-4.1.1 being unable
> > to boot on HP BL460c G7 blades.  The problem turned out to be a null
> > function pointer deference (ns_to_tick in cpu_idle.c) during early boot
> > of dom0, in the set_cx_pminfo function.
> >
> > I applied your patch, changeset 23662:2faba14bac13, about initializing
> > default C state information, and this appears to have fixed the problem.
> >
> > However, I see in the patch that setting up the function pointers
> > (ns_to_tick, tick_to_ns etc) is predicated on the hypercall coming in on
> > CPU0.
> 
> Firstly, it's predicated on the hypercall addressing CPU0, rather than being
> executed on CPU0. Secondly, the cpuidle_init_cpu() functiomn is *also*
> called from the CPU-hotplug path in Xen, and is called directly from the
> presmp_initcall path for CPU0. I don't know why it is called both on a
> hypercall path and on a hotplug path, it seems weird. But anyhow, this means
> that the function pointers will guaranteed get set up early during Xen boot?

yes. do it in hypercall path is b/c present ACPI processors may not be online
by Xen yet. do it in CPU-hotplug path is b/c we may want to do some resource
housekeeping according to CPU-hotplug event, though now only CPU_ONLINE
is handled. It's a bit overkilled, but should be ok for saneness reason.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: Query about cpuidle
  2011-09-09 12:18 Query about cpuidle Andrew Cooper
  2011-09-09 12:50 ` Keir Fraser
@ 2011-09-13  3:19 ` Tian, Kevin
  1 sibling, 0 replies; 4+ messages in thread
From: Tian, Kevin @ 2011-09-13  3:19 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel@lists.xensource.com

forgot to CC the list

> -----Original Message-----
> From: Tian, Kevin
> Sent: Tuesday, September 13, 2011 11:07 AM
> To: 'Andrew Cooper'
> Subject: RE: Query about cpuidle
> 
> > From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> > Sent: Friday, September 09, 2011 8:18 PM
> >
> > Hello,
> >
> > We have recently had a support escalation about Xen-4.1.1 being unable
> > to boot on HP BL460c G7 blades.  The problem turned out to be a null
> > function pointer deference (ns_to_tick in cpu_idle.c) during early boot
> > of dom0, in the set_cx_pminfo function.
> 
> one possible reason is related to the order of how ACPI processor objects
> are found. CPU0 may not be always the 1st item there, and thus the
> ns_to_tick is not initialized accordingly.
> 
> you can easily verify this by adding printk in set_cx_pminfo.
> 
> >
> > I applied your patch, changeset 23662:2faba14bac13, about initializing
> > default C state information, and this appears to have fixed the problem.
> 
> with this patch function pointers are initialized when Xen is bringing up
> CPUs, which happens before dom0 and thus should avoid the problem
> 
> >
> > However, I see in the patch that setting up the function pointers
> > (ns_to_tick, tick_to_ns etc) is predicated on the hypercall coming in on
> > CPU0.  What guarantees are in place to ensure that these function
> > pointers get set up?  I cant see anything obvious from the code, but
> > have to admit that the null pointer deference appears to have gone away.
> 
> this is due to the cpu notifier coming in the latter part of the patch.
> 
> having said that, original logic looks error-prone, since there's no
> guarantee that cpu0 will be always firstly registered. We should just
> look at whether ns_to_tick is NULL, and if yes then do appropriate
> initialization at the 1st place, if you don't want to pull the whole 23662
> into your version.
> 
> Thanks
> Kevin
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-09-13  3:19 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-09-09 12:18 Query about cpuidle Andrew Cooper
2011-09-09 12:50 ` Keir Fraser
2011-09-13  3:18   ` Tian, Kevin
2011-09-13  3:19 ` Tian, Kevin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.