All of lore.kernel.org
 help / color / mirror / Atom feed
* xen_sched_clock problem in RHEL6b2, Was: [PATCH] x86: unconditionally mark TSC unstable under Xen
@ 2010-07-16 22:32 Dan Magenheimer
  2010-07-19  6:00 ` Michal Novotny
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Magenheimer @ 2010-07-16 22:32 UTC (permalink / raw)
  To: Michal Novotny; +Cc: Jeremy Fitzhardinge, xen-devel, Jed Smith, Jan Beulich

Hi Michael --

Not sure if you are the right person for this, but FYI
it appears that this bug is also in RHEL6 beta 2 and
might account for some weird issues I've seen running
RHEL6 betas as Xen guests.

Problem description:
http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00241.html 
Jeremy's solution (forwarded below):
http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00749.html 

Dan

> -----Original Message-----
> From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
> Sent: Wednesday, July 14, 2010 3:36 PM
> To: Jed Smith
> Cc: xen-devel@lists.xensource.com; Jan Beulich
> Subject: [Xen-devel] Re: [PATCH] x86: unconditionally mark TSC unstable
> under Xen
> 
> On 07/14/2010 12:24 PM, Jed Smith wrote:
> > Jeremy, Jan - what do you think?  Is this a bad move?  I feel like
> there
> > is a consequence to this that I am unaware of, but it fixes my issue.
> >
> 
> Ah, well that's interesting.
> 
> There's a couple of comments:
> 
>    1. you can't do this with just a compile-time test, since the same
>       kernel can also boot native
>    2. nothing in a Xen PV domU environment should be using the tsc
>       directly, so this shouldn't have an effect.  If something is
> using
>       the tsc we should track it down.
> 
> I wonder, however, if you're getting the same result as Jan's
> suggestion
> of making sched_clock unstable by making the tsc unstable.
> 
> In that case, this patch may help:
> 
> Subject: [PATCH] xen: disable xen_sched_clock by default
> 
> xen_sched_clock only counts unstolen time.  In principle this should
> be useful to the Linux scheduler so that it knows how much time a
> process
> actually consumed.  But in practice this doesn't work very well as the
> scheduler expects the sched_clock time to be synchronized between
> cpus.  It also uses sched_clock to measure the time a task spends
> sleeping, in which case "unstolen time" isn't meaningful.
> 
> So just use plain xen_clocksource_read to return wallclock nanoseconds
> for sched_clock.
> 
> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> 
> diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
> index b83e119..6a09f01 100644
> --- a/arch/x86/xen/Kconfig
> +++ b/arch/x86/xen/Kconfig
> @@ -29,6 +29,10 @@ config XEN_SAVE_RESTORE
>         depends on XEN && PM
>         default y
> 
> +config XEN_SCHED_CLOCK
> +       bool
> +       default n
> +
>  config XEN_DEBUG_FS
>  	bool "Enable Xen debug and tuning parameters in debugfs"
>  	depends on XEN && DEBUG_FS
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 4b57c0b..242a230 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -939,7 +939,11 @@ static const struct pv_time_ops xen_time_ops
> __initdata = {
>  	.set_wallclock = xen_set_wallclock,
>  	.get_wallclock = xen_get_wallclock,
>  	.get_tsc_khz = xen_tsc_khz,
> +#ifdef CONFIG_XEN_SCHED_CLOCK
>  	.sched_clock = xen_sched_clock,
> +#else
> +	.sched_clock = xen_clocksource_read,
> +#endif
>  };
> 
>  static const struct pv_cpu_ops xen_cpu_ops __initdata = {
> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
> index 9d1f853..32dc3dc 100644
> --- a/arch/x86/xen/time.c
> +++ b/arch/x86/xen/time.c
> @@ -154,6 +154,7 @@ static void do_stolen_accounting(void)
>  	account_idle_ticks(ticks);
>  }
> 
> +#ifdef CONFIG_XEN_SCHED_CLOCK
>  /*
>   * Xen sched_clock implementation.  Returns the number of unstolen
>   * nanoseconds, which is nanoseconds the VCPU spent in RUNNING+BLOCKED
> @@ -191,7 +192,7 @@ unsigned long long xen_sched_clock(void)
> 
>  	return ret;
>  }
> -
> +#endif
> 
>  /* Get the TSC speed from Xen */
>  unsigned long xen_tsc_khz(void)
> 
> Thanks,
> 	J
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: xen_sched_clock problem in RHEL6b2, Was: [PATCH] x86: unconditionally mark TSC unstable under Xen
  2010-07-16 22:32 xen_sched_clock problem in RHEL6b2, Was: [PATCH] x86: unconditionally mark TSC unstable under Xen Dan Magenheimer
@ 2010-07-19  6:00 ` Michal Novotny
  2010-07-19  7:00   ` Andrew Jones
  0 siblings, 1 reply; 3+ messages in thread
From: Michal Novotny @ 2010-07-19  6:00 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Jeremy Fitzhardinge, xen-devel, Andrew Jones, Jed Smith,
	Jan Beulich

Hi Dan,
since it looks like this is the kernel/hypervisor bug I'm not really the 
right one to ask. You should better as Andrew Jones - CCed now (his 
e-mail is drjones@redhat.com just for sure) - since he's working in the 
kernel-xen area and therefore even the RHEL-6 kernel so I guess he's a 
good guy to ask about this one. Also, according to the links it seems 
you're running RHEL-6 beta 2 as PV guest, right ? Are you running this 
on Xen-4.1-unstable or what version?

Andrew, could you please have a look this one? I don't know whether it's 
RHEL-6 kernel bug itself or whether there's some bug in the hypervisor, 
nevetheless both parts belongs to the same component - kernel-xen - 
you're working on so I guess you're a good guy to ask. Please, leave me 
in the CC list as I'm interested as well.

Thanks,
Michal

On 07/17/2010 12:32 AM, Dan Magenheimer wrote:
> Hi Michael --
>
> Not sure if you are the right person for this, but FYI
> it appears that this bug is also in RHEL6 beta 2 and
> might account for some weird issues I've seen running
> RHEL6 betas as Xen guests.
>
> Problem description:
> http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00241.html
> Jeremy's solution (forwarded below):
> http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00749.html
>
> Dan
>
>    
>> -----Original Message-----
>> From: Jeremy Fitzhardinge [mailto:jeremy@goop.org]
>> Sent: Wednesday, July 14, 2010 3:36 PM
>> To: Jed Smith
>> Cc: xen-devel@lists.xensource.com; Jan Beulich
>> Subject: [Xen-devel] Re: [PATCH] x86: unconditionally mark TSC unstable
>> under Xen
>>
>> On 07/14/2010 12:24 PM, Jed Smith wrote:
>>      
>>> Jeremy, Jan - what do you think?  Is this a bad move?  I feel like
>>>        
>> there
>>      
>>> is a consequence to this that I am unaware of, but it fixes my issue.
>>>
>>>        
>> Ah, well that's interesting.
>>
>> There's a couple of comments:
>>
>>     1. you can't do this with just a compile-time test, since the same
>>        kernel can also boot native
>>     2. nothing in a Xen PV domU environment should be using the tsc
>>        directly, so this shouldn't have an effect.  If something is
>> using
>>        the tsc we should track it down.
>>
>> I wonder, however, if you're getting the same result as Jan's
>> suggestion
>> of making sched_clock unstable by making the tsc unstable.
>>
>> In that case, this patch may help:
>>
>> Subject: [PATCH] xen: disable xen_sched_clock by default
>>
>> xen_sched_clock only counts unstolen time.  In principle this should
>> be useful to the Linux scheduler so that it knows how much time a
>> process
>> actually consumed.  But in practice this doesn't work very well as the
>> scheduler expects the sched_clock time to be synchronized between
>> cpus.  It also uses sched_clock to measure the time a task spends
>> sleeping, in which case "unstolen time" isn't meaningful.
>>
>> So just use plain xen_clocksource_read to return wallclock nanoseconds
>> for sched_clock.
>>
>> Signed-off-by: Jeremy Fitzhardinge<jeremy.fitzhardinge@citrix.com>
>>
>> diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
>> index b83e119..6a09f01 100644
>> --- a/arch/x86/xen/Kconfig
>> +++ b/arch/x86/xen/Kconfig
>> @@ -29,6 +29,10 @@ config XEN_SAVE_RESTORE
>>          depends on XEN&&  PM
>>          default y
>>
>> +config XEN_SCHED_CLOCK
>> +       bool
>> +       default n
>> +
>>   config XEN_DEBUG_FS
>>   	bool "Enable Xen debug and tuning parameters in debugfs"
>>   	depends on XEN&&  DEBUG_FS
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index 4b57c0b..242a230 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -939,7 +939,11 @@ static const struct pv_time_ops xen_time_ops
>> __initdata = {
>>   	.set_wallclock = xen_set_wallclock,
>>   	.get_wallclock = xen_get_wallclock,
>>   	.get_tsc_khz = xen_tsc_khz,
>> +#ifdef CONFIG_XEN_SCHED_CLOCK
>>   	.sched_clock = xen_sched_clock,
>> +#else
>> +	.sched_clock = xen_clocksource_read,
>> +#endif
>>   };
>>
>>   static const struct pv_cpu_ops xen_cpu_ops __initdata = {
>> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
>> index 9d1f853..32dc3dc 100644
>> --- a/arch/x86/xen/time.c
>> +++ b/arch/x86/xen/time.c
>> @@ -154,6 +154,7 @@ static void do_stolen_accounting(void)
>>   	account_idle_ticks(ticks);
>>   }
>>
>> +#ifdef CONFIG_XEN_SCHED_CLOCK
>>   /*
>>    * Xen sched_clock implementation.  Returns the number of unstolen
>>    * nanoseconds, which is nanoseconds the VCPU spent in RUNNING+BLOCKED
>> @@ -191,7 +192,7 @@ unsigned long long xen_sched_clock(void)
>>
>>   	return ret;
>>   }
>> -
>> +#endif
>>
>>   /* Get the TSC speed from Xen */
>>   unsigned long xen_tsc_khz(void)
>>
>> Thanks,
>> 	J
>>
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xensource.com
>> http://lists.xensource.com/xen-devel
>>      


-- 
Michal Novotny<minovotn@redhat.com>, RHCE
Virtualization Team (xen userspace), Red Hat

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Re: xen_sched_clock problem in RHEL6b2, Was: [PATCH] x86: unconditionally mark TSC unstable under Xen
  2010-07-19  6:00 ` Michal Novotny
@ 2010-07-19  7:00   ` Andrew Jones
  0 siblings, 0 replies; 3+ messages in thread
From: Andrew Jones @ 2010-07-19  7:00 UTC (permalink / raw)
  To: Michal Novotny
  Cc: Dan Magenheimer, xen-devel, Jeremy Fitzhardinge, Jan Beulich,
	Jed Smith

On 07/19/2010 08:00 AM, Michal Novotny wrote:
> Andrew, could you please have a look this one? I don't know whether it's
> RHEL-6 kernel bug itself or whether there's some bug in the hypervisor,
> nevetheless both parts belongs to the same component - kernel-xen -
> you're working on so I guess you're a good guy to ask. Please, leave me
> in the CC list as I'm interested as well.

Yup, I've been watching this thread. I haven't seen anything strange
with my rhel6 testing yet, but Jeremy's explanation makes sense, so I'm
considering the patch.

Andrew

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-07-19  7:00 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-16 22:32 xen_sched_clock problem in RHEL6b2, Was: [PATCH] x86: unconditionally mark TSC unstable under Xen Dan Magenheimer
2010-07-19  6:00 ` Michal Novotny
2010-07-19  7:00   ` Andrew Jones

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.