* [PATCH] raise tsc clocksource rating
@ 2007-10-29 23:10 Glauber de Oliveira Costa
[not found] ` <11936994092607-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 29+ messages in thread
From: Glauber de Oliveira Costa @ 2007-10-29 23:10 UTC (permalink / raw)
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w,
kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
Glauber de Oliveira Costa, Glauber de Oliveira Costa,
--cc-H+wXaHxf7aLQT0dZR+AlfA
From: Glauber de Oliveira Costa <glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org>
tsc is very good time source (when it does not have drifts, does not
change it's frequency, i.e. when it works), so it should have its rating
raised to a value greater than, or equal 400.
Since it's being a tendency among paravirt clocksources to use values
around 400, we should declare tsc as even better: So we use 500.
This patch also touches the comments on clocksource.h, which suggests
that 499 would be a limit on the rating values.
Signed-off-by: Glauber de Oliveira Costa <gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
---
arch/x86/kernel/tsc_32.c | 2 +-
arch/x86/kernel/tsc_64.c | 2 +-
include/linux/clocksource.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c
index 9ebc0da..4d91e59 100644
--- a/arch/x86/kernel/tsc_32.c
+++ b/arch/x86/kernel/tsc_32.c
@@ -280,7 +280,7 @@ static cycle_t read_tsc(void)
static struct clocksource clocksource_tsc = {
.name = "tsc",
- .rating = 300,
+ .rating = 500,
.read = read_tsc,
.mask = CLOCKSOURCE_MASK(64),
.mult = 0, /* to be set */
diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c
index 9c70af4..4fd5b1b 100644
--- a/arch/x86/kernel/tsc_64.c
+++ b/arch/x86/kernel/tsc_64.c
@@ -262,7 +262,7 @@ static cycle_t __vsyscall_fn vread_tsc(void)
static struct clocksource clocksource_tsc = {
.name = "tsc",
- .rating = 300,
+ .rating = 500,
.read = read_tsc,
.mask = CLOCKSOURCE_MASK(64),
.shift = 22,
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 107787a..5b0aadd 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -39,7 +39,7 @@ struct clocksource;
* A correct and usable clocksource.
* 300-399: Desired.
* A reasonably fast and accurate clocksource.
- * 400-499: Perfect
+ * >= 400 : Perfect
* The ideal clocksource. A must-use where
* available.
* @read: returns a cycle value
--
1.5.0.6
-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems? Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
^ permalink raw reply related [flat|nested] 29+ messages in thread[parent not found: <11936994092607-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <11936994092607-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2007-10-29 22:17 ` Thomas Gleixner 2007-10-29 22:36 ` Ingo Molnar ` (2 more replies) 2007-10-29 22:42 ` Zachary Amsden 2007-10-30 0:17 ` H. Peter Anvin 2 siblings, 3 replies; 29+ messages in thread From: Thomas Gleixner @ 2007-10-29 22:17 UTC (permalink / raw) To: Glauber de Oliveira Costa Cc: Jeremy Fitzhardinge, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, John Stultz, LKML, --cc-H+wXaHxf7aLQT0dZR+AlfA On Mon, 29 Oct 2007, Glauber de Oliveira Costa wrote: CC'ed John and removed glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org :) > From: Glauber de Oliveira Costa <glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org> > > tsc is very good time source (when it does not have drifts, does not > change it's frequency, i.e. when it works), so it should have its rating > raised to a value greater than, or equal 400. > > Since it's being a tendency among paravirt clocksources to use values > around 400, we should declare tsc as even better: So we use 500. > > This patch also touches the comments on clocksource.h, which suggests > that 499 would be a limit on the rating values. > > Signed-off-by: Glauber de Oliveira Costa <gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Acked-by: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org> > --- > arch/x86/kernel/tsc_32.c | 2 +- > arch/x86/kernel/tsc_64.c | 2 +- > include/linux/clocksource.h | 2 +- > 3 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c > index 9ebc0da..4d91e59 100644 > --- a/arch/x86/kernel/tsc_32.c > +++ b/arch/x86/kernel/tsc_32.c > @@ -280,7 +280,7 @@ static cycle_t read_tsc(void) > > static struct clocksource clocksource_tsc = { > .name = "tsc", > - .rating = 300, > + .rating = 500, > .read = read_tsc, > .mask = CLOCKSOURCE_MASK(64), > .mult = 0, /* to be set */ > diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c > index 9c70af4..4fd5b1b 100644 > --- a/arch/x86/kernel/tsc_64.c > +++ b/arch/x86/kernel/tsc_64.c > @@ -262,7 +262,7 @@ static cycle_t __vsyscall_fn vread_tsc(void) > > static struct clocksource clocksource_tsc = { > .name = "tsc", > - .rating = 300, > + .rating = 500, > .read = read_tsc, > .mask = CLOCKSOURCE_MASK(64), > .shift = 22, > diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h > index 107787a..5b0aadd 100644 > --- a/include/linux/clocksource.h > +++ b/include/linux/clocksource.h > @@ -39,7 +39,7 @@ struct clocksource; > * A correct and usable clocksource. > * 300-399: Desired. > * A reasonably fast and accurate clocksource. > - * 400-499: Perfect > + * >= 400 : Perfect > * The ideal clocksource. A must-use where > * available. > * @read: returns a cycle value > -- > 1.5.0.6 > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating 2007-10-29 22:17 ` Thomas Gleixner @ 2007-10-29 22:36 ` Ingo Molnar 2007-10-30 1:26 ` john stultz [not found] ` <alpine.LFD.0.9999.0710292315280.3186-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2 siblings, 0 replies; 29+ messages in thread From: Ingo Molnar @ 2007-10-29 22:36 UTC (permalink / raw) To: Thomas Gleixner Cc: Glauber de Oliveira Costa, LKML, Rusty Russell, Jeremy Fitzhardinge, --cc, avi, kvm-devel, John Stultz * Thomas Gleixner <tglx@linutronix.de> wrote: > > Since it's being a tendency among paravirt clocksources to use > > values around 400, we should declare tsc as even better: So we use > > 500. > > > > This patch also touches the comments on clocksource.h, which > > suggests that 499 would be a limit on the rating values. > > > > Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> > > Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Ingo ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating 2007-10-29 22:17 ` Thomas Gleixner 2007-10-29 22:36 ` Ingo Molnar @ 2007-10-30 1:26 ` john stultz [not found] ` <alpine.LFD.0.9999.0710292315280.3186-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> 2 siblings, 0 replies; 29+ messages in thread From: john stultz @ 2007-10-30 1:26 UTC (permalink / raw) To: Thomas Gleixner Cc: Glauber de Oliveira Costa, LKML, Rusty Russell, Jeremy Fitzhardinge, --cc, Ingo Molnar, avi, kvm-devel On Mon, 2007-10-29 at 23:17 +0100, Thomas Gleixner wrote: > On Mon, 29 Oct 2007, Glauber de Oliveira Costa wrote: > > CC'ed John and removed glauber@t60.localdomain :) > > > From: Glauber de Oliveira Costa <glauber@t60.localdomain> > > > > tsc is very good time source (when it does not have drifts, does not > > change it's frequency, i.e. when it works), so it should have its rating > > raised to a value greater than, or equal 400. But all of those qualities (and more) make it less then ideal. What issue exactly does this patch resolve? > > Since it's being a tendency among paravirt clocksources to use values > > around 400, we should declare tsc as even better: So we use 500. This is the rating inflation I was initially worried about and created the advisory scale to avoid. If paravirt clocksources are rated too high, they should be dropped. I strongly disagree that the TSC is a "perfect" clocksource, and I think its rating is appropriate. > > This patch also touches the comments on clocksource.h, which suggests > > that 499 would be a limit on the rating values. > > > > Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> > > Acked-by: Thomas Gleixner <tglx@linutronix.de> If a better justification can be provided, I might reconsider, but right now I can't ack this. thanks -john > > --- > > arch/x86/kernel/tsc_32.c | 2 +- > > arch/x86/kernel/tsc_64.c | 2 +- > > include/linux/clocksource.h | 2 +- > > 3 files changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c > > index 9ebc0da..4d91e59 100644 > > --- a/arch/x86/kernel/tsc_32.c > > +++ b/arch/x86/kernel/tsc_32.c > > @@ -280,7 +280,7 @@ static cycle_t read_tsc(void) > > > > static struct clocksource clocksource_tsc = { > > .name = "tsc", > > - .rating = 300, > > + .rating = 500, > > .read = read_tsc, > > .mask = CLOCKSOURCE_MASK(64), > > .mult = 0, /* to be set */ > > diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c > > index 9c70af4..4fd5b1b 100644 > > --- a/arch/x86/kernel/tsc_64.c > > +++ b/arch/x86/kernel/tsc_64.c > > @@ -262,7 +262,7 @@ static cycle_t __vsyscall_fn vread_tsc(void) > > > > static struct clocksource clocksource_tsc = { > > .name = "tsc", > > - .rating = 300, > > + .rating = 500, > > .read = read_tsc, > > .mask = CLOCKSOURCE_MASK(64), > > .shift = 22, > > diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h > > index 107787a..5b0aadd 100644 > > --- a/include/linux/clocksource.h > > +++ b/include/linux/clocksource.h > > @@ -39,7 +39,7 @@ struct clocksource; > > * A correct and usable clocksource. > > * 300-399: Desired. > > * A reasonably fast and accurate clocksource. > > - * 400-499: Perfect > > + * >= 400 : Perfect > > * The ideal clocksource. A must-use where > > * available. > > * @read: returns a cycle value > > -- > > 1.5.0.6 > > > > - > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ > > ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <alpine.LFD.0.9999.0710292315280.3186-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <alpine.LFD.0.9999.0710292315280.3186-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org> @ 2007-10-30 2:39 ` Rusty Russell [not found] ` <200710301339.23628.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Rusty Russell @ 2007-10-30 2:39 UTC (permalink / raw) To: Thomas Gleixner Cc: Jeremy Fitzhardinge, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, John Stultz, LKML, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA On Tuesday 30 October 2007 09:17:38 Thomas Gleixner wrote: > On Mon, 29 Oct 2007, Glauber de Oliveira Costa wrote: > > CC'ed John and removed glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org :) > > > From: Glauber de Oliveira Costa <glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org> > > > > tsc is very good time source (when it does not have drifts, does not > > change it's frequency, i.e. when it works), so it should have its rating > > raised to a value greater than, or equal 400. > > > > Since it's being a tendency among paravirt clocksources to use values > > around 400, we should declare tsc as even better: So we use 500. > > > > This patch also touches the comments on clocksource.h, which suggests > > that 499 would be a limit on the rating values. > > > > Signed-off-by: Glauber de Oliveira Costa <gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> > > Acked-by: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org> No. tsc is very good, it's not perfect. If a paravirt clock registers 400 it really means "pick me over the tsc". That's *why* they use > 400: it's in the documentation. Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <200710301339.23628.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <200710301339.23628.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> @ 2007-10-30 7:37 ` Ingo Molnar [not found] ` <20071030073736.GA21843-X9Un+BFzKDI@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Ingo Molnar @ 2007-10-30 7:37 UTC (permalink / raw) To: Rusty Russell Cc: Jeremy Fitzhardinge, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, John Stultz, LKML, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA * Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> wrote: > No. tsc is very good, it's not perfect. If a paravirt clock > registers 400 it really means "pick me over the tsc". often the TSC is not perfect, but _IF_ it's perfect, using the paravirt driver is a pessimisation in performance. the main problem at the moment is that there's no mechanism at the moment to convey to the guest the information that the TSC is "perfect", and to convey the calibration values. > That's *why* they use > 400: it's in the documentation. static values do not capture conditional quality like that of the TSC. and just in case it's not obvious: i am not arguing for the inclusion of the patch, i'm just pointing out the plain fact that in the case where the TSC _is_ reliable, 5 different clocksource drivers for has obvious disadvantages. Anyone arguing against that simple point needs his head examined :) Once we can pass around calibration information from the host to the guest (which we dont do at the moment) there will be reason not to use the native clocksource driver in the guest. in the long run, the paravirt clocksource drivers must become _fallback_ drivers, for the case when the TSC is not perfect. Instead of the current "lets try to make it reliable but also nearly as fast as the TSC", which leads to inevitable breakage on SMP. Ingo ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <20071030073736.GA21843-X9Un+BFzKDI@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <20071030073736.GA21843-X9Un+BFzKDI@public.gmane.org> @ 2007-10-30 10:52 ` Rusty Russell 2007-10-30 12:13 ` Glauber de Oliveira Costa 1 sibling, 0 replies; 29+ messages in thread From: Rusty Russell @ 2007-10-30 10:52 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, John Stultz, LKML, Glauber de Oliveira Costa On Tuesday 30 October 2007 18:37:36 Ingo Molnar wrote: > * Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> wrote: > > No. tsc is very good, it's not perfect. If a paravirt clock > > registers 400 it really means "pick me over the tsc". > > often the TSC is not perfect, but _IF_ it's perfect, using the paravirt > driver is a pessimisation in performance. > > the main problem at the moment is that there's no mechanism at the > moment to convey to the guest the information that the TSC is "perfect", > and to convey the calibration values. The host can communicate to the guest what clock to use: the guest can decide to register a paravirt clock or not depending on whether it wants to leave it to the TSC. For a while we couldn't remove the TSC cpuid capability in the guest, because if you configured your kernel in some ways it was hardcoded on. I think the "all 686+ have a tsc" assumption has now been fixed, so I should change the lguest clock to do as you said: register its clock at lower prio to the TSC and then the host can simply remove the TSC cpuid if it isn't suitable for the guest to use. ISTR the core TSC timing code (which lguest could use) and various hardware manipulations (which lguest couldn't) were intertwined, but I'll have to go back and check exactly what the issue was. > and just in case it's not obvious: i am not arguing for the inclusion of > the patch Unfortunately, you and Thomas both acked the patch. This says v bad things about how much review kernel patches get. Cheers, Rusty. ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating [not found] ` <20071030073736.GA21843-X9Un+BFzKDI@public.gmane.org> 2007-10-30 10:52 ` Rusty Russell @ 2007-10-30 12:13 ` Glauber de Oliveira Costa 1 sibling, 0 replies; 29+ messages in thread From: Glauber de Oliveira Costa @ 2007-10-30 12:13 UTC (permalink / raw) To: Ingo Molnar Cc: Jeremy Fitzhardinge, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, John Stultz, LKML, --cc-H+wXaHxf7aLQT0dZR+AlfA -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Ingo Molnar escreveu: > * Rusty Russell <rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org> wrote: > > and just in case it's not obvious: i am not arguing for the inclusion of > the patch, i'm just pointing out the plain fact that in the case where > the TSC _is_ reliable, 5 different clocksource drivers for has obvious > disadvantages. Anyone arguing against that simple point needs his head > examined :) Once we can pass around calibration information from the > host to the guest (which we dont do at the moment) there will be reason > not to use the native clocksource driver in the guest. If you sustain that we cannot have a reliable synchronization test mechanism, neither do I. All this is based in the assumption that a bad tsc will fail such tests. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (GNU/Linux) Comment: Using GnuPG with Remi - http://enigmail.mozdev.org iD8DBQFHJyAHjYI8LaFUWXMRAtYoAJ9l5kNodRLfTsNHDvyioufM7SQzTACfarEy XXq3sDq9uxZ/72hhA46YmgM= =DwN/ -----END PGP SIGNATURE----- ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating [not found] ` <11936994092607-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2007-10-29 22:17 ` Thomas Gleixner @ 2007-10-29 22:42 ` Zachary Amsden [not found] ` <1193697734.9793.86.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> 2007-10-29 22:48 ` Ingo Molnar 2007-10-30 0:17 ` H. Peter Anvin 2 siblings, 2 replies; 29+ messages in thread From: Zachary Amsden @ 2007-10-29 22:42 UTC (permalink / raw) To: Glauber de Oliveira Costa Cc: jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith On Mon, 2007-10-29 at 20:10 -0300, Glauber de Oliveira Costa wrote: > From: Glauber de Oliveira Costa <glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org> > > tsc is very good time source (when it does not have drifts, does not > change it's frequency, i.e. when it works), so it should have its rating > raised to a value greater than, or equal 400. > > Since it's being a tendency among paravirt clocksources to use values > around 400, we should declare tsc as even better: So we use 500. Why is the TSC better than a paravirt clocksource? In our case this is definitely inaccurate. Paravirt clocksources should be preferred to TSC, and both must be made available in hardware for platforms which do not support paravirt. Also, please cc all the paravirt developers on things related to paravirt, especially things with such broad effect. I think 400 is a good value for a perfect native clocksource. >400 should be reserved for super-real (i.e. paravirt) sources that should always be chosen over a hardware realistic implementation in a virtual environment. Zach ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <1193697734.9793.86.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <1193697734.9793.86.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> @ 2007-10-29 22:45 ` Jeremy Fitzhardinge 0 siblings, 0 replies; 29+ messages in thread From: Jeremy Fitzhardinge @ 2007-10-29 22:45 UTC (permalink / raw) To: Zachary Amsden Cc: avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith Zachary Amsden wrote: > On Mon, 2007-10-29 at 20:10 -0300, Glauber de Oliveira Costa wrote: > >> From: Glauber de Oliveira Costa <glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org> >> >> tsc is very good time source (when it does not have drifts, does not >> change it's frequency, i.e. when it works), so it should have its rating >> raised to a value greater than, or equal 400. >> >> Since it's being a tendency among paravirt clocksources to use values >> around 400, we should declare tsc as even better: So we use 500. >> > > Why is the TSC better than a paravirt clocksource? In our case this is > definitely inaccurate. Paravirt clocksources should be preferred to > TSC, and both must be made available in hardware for platforms which do > not support paravirt. > > Also, please cc all the paravirt developers on things related to > paravirt, especially things with such broad effect. I think 400 is a > good value for a perfect native clocksource. >400 should be reserved > for super-real (i.e. paravirt) sources that should always be chosen over > a hardware realistic implementation in a virtual environment. > Yes, agreed. The tsc is never the right thing to use if there's a paravirt clocksource available. What's wrong with rating it 300? What inferior clocksource does it lose out to? Shouldn't that clocksource be lowered? (Why don't we just use 1 to 10?) J ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating 2007-10-29 22:42 ` Zachary Amsden [not found] ` <1193697734.9793.86.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> @ 2007-10-29 22:48 ` Ingo Molnar [not found] ` <20071029224852.GA27547-X9Un+BFzKDI@public.gmane.org> 1 sibling, 1 reply; 29+ messages in thread From: Ingo Molnar @ 2007-10-29 22:48 UTC (permalink / raw) To: Zachary Amsden Cc: Glauber de Oliveira Costa, linux-kernel, tglx, rusty, jeremy, --cc, avi, kvm-devel, Glauber de Oliveira Costa, Dan Hecht, Garrett Smith * Zachary Amsden <zach@vmware.com> wrote: > On Mon, 2007-10-29 at 20:10 -0300, Glauber de Oliveira Costa wrote: > > From: Glauber de Oliveira Costa <glauber@t60.localdomain> > > > > tsc is very good time source (when it does not have drifts, does not > > change it's frequency, i.e. when it works), so it should have its rating > > raised to a value greater than, or equal 400. > > > > Since it's being a tendency among paravirt clocksources to use values > > around 400, we should declare tsc as even better: So we use 500. > > Why is the TSC better than a paravirt clocksource? In our case this > is definitely inaccurate. Paravirt clocksources should be preferred > to TSC, and both must be made available in hardware for platforms > which do not support paravirt. if it's inaccurate why are you exposing it to the guest then? Native only uses the TSC if it's safe and accurate to do so. Ingo ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <20071029224852.GA27547-X9Un+BFzKDI@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <20071029224852.GA27547-X9Un+BFzKDI@public.gmane.org> @ 2007-10-29 22:52 ` Jeremy Fitzhardinge 2007-10-29 22:55 ` Ingo Molnar 2007-10-29 22:55 ` Zachary Amsden 1 sibling, 1 reply; 29+ messages in thread From: Jeremy Fitzhardinge @ 2007-10-29 22:52 UTC (permalink / raw) To: Ingo Molnar Cc: Zachary Amsden, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith Ingo Molnar wrote: > * Zachary Amsden <zach-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: > > >> On Mon, 2007-10-29 at 20:10 -0300, Glauber de Oliveira Costa wrote: >> >>> From: Glauber de Oliveira Costa <glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org> >>> >>> tsc is very good time source (when it does not have drifts, does not >>> change it's frequency, i.e. when it works), so it should have its rating >>> raised to a value greater than, or equal 400. >>> >>> Since it's being a tendency among paravirt clocksources to use values >>> around 400, we should declare tsc as even better: So we use 500. >>> >> Why is the TSC better than a paravirt clocksource? In our case this >> is definitely inaccurate. Paravirt clocksources should be preferred >> to TSC, and both must be made available in hardware for platforms >> which do not support paravirt. >> > > if it's inaccurate why are you exposing it to the guest then? Native > only uses the TSC if it's safe and accurate to do so. > It is used as part of the Xen clocksource as a short term extrapolator, with correction parameters supplied by the hypervisor. It should never be used directly. J ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating 2007-10-29 22:52 ` Jeremy Fitzhardinge @ 2007-10-29 22:55 ` Ingo Molnar [not found] ` <20071029225528.GA31293-X9Un+BFzKDI@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Ingo Molnar @ 2007-10-29 22:55 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Zachary Amsden, Glauber de Oliveira Costa, linux-kernel, tglx, rusty, --cc, avi, kvm-devel, Glauber de Oliveira Costa, Dan Hecht, Garrett Smith * Jeremy Fitzhardinge <jeremy@goop.org> wrote: > > if it's inaccurate why are you exposing it to the guest then? Native > > only uses the TSC if it's safe and accurate to do so. > > It is used as part of the Xen clocksource as a short term > extrapolator, with correction parameters supplied by the hypervisor. > It should never be used directly. that's totally broken then. You cannot create an SMP-safe monotonic clocksource via interpolation - native does not do it either. Good thing this problem got exposed, it needs to be fixed. Ingo ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <20071029225528.GA31293-X9Un+BFzKDI@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <20071029225528.GA31293-X9Un+BFzKDI@public.gmane.org> @ 2007-10-29 23:17 ` Jeremy Fitzhardinge [not found] ` <47266A00.4040408-TSDbQ3PG+2Y@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Jeremy Fitzhardinge @ 2007-10-29 23:17 UTC (permalink / raw) To: Ingo Molnar Cc: Zachary Amsden, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Avi Kivity, Glauber de Oliveira Costa, Glauber de Oliveira Costa, Garrett Smith Ingo Molnar wrote: > that's totally broken then. You cannot create an SMP-safe monotonic > clocksource via interpolation - native does not do it either. Good thing > this problem got exposed, it needs to be fixed. > Sigh, I don't really want to have this fight again. I don't really see what point there is in raising the tsc's rating (why is 300 insufficient again?), but regardless of the value, the Xen clocksource rating needs to be higher. J ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <47266A00.4040408-TSDbQ3PG+2Y@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <47266A00.4040408-TSDbQ3PG+2Y@public.gmane.org> @ 2007-10-29 23:21 ` Ingo Molnar [not found] ` <20071029232140.GA10518-X9Un+BFzKDI@public.gmane.org> 2007-10-30 0:45 ` Ian Pratt 0 siblings, 2 replies; 29+ messages in thread From: Ingo Molnar @ 2007-10-29 23:21 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Zachary Amsden, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Avi Kivity, Glauber de Oliveira Costa, Glauber de Oliveira Costa, Garrett Smith * Jeremy Fitzhardinge <jeremy-TSDbQ3PG+2Y@public.gmane.org> wrote: > Ingo Molnar wrote: > > that's totally broken then. You cannot create an SMP-safe monotonic > > clocksource via interpolation - native does not do it either. Good thing > > this problem got exposed, it needs to be fixed. > > > > Sigh, I don't really want to have this fight again. i dont remember us having discussed this before, ever. If there's any "fight" about monotonicity and SMP then it would be a pretty onesided affair, with you being beaten up seriously ;-) > I don't really see what point there is in raising the tsc's rating > (why is 300 insufficient again?), but regardless of the value, the Xen > clocksource rating needs to be higher. anyway, i agree that this patch cannot go in in its current form. Ingo ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <20071029232140.GA10518-X9Un+BFzKDI@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <20071029232140.GA10518-X9Un+BFzKDI@public.gmane.org> @ 2007-10-29 23:33 ` Jeremy Fitzhardinge 0 siblings, 0 replies; 29+ messages in thread From: Jeremy Fitzhardinge @ 2007-10-29 23:33 UTC (permalink / raw) To: Ingo Molnar Cc: Zachary Amsden, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Avi Kivity, Glauber de Oliveira Costa, Glauber de Oliveira Costa, Garrett Smith Ingo Molnar wrote: > i dont remember us having discussed this before, ever. If there's any > "fight" about monotonicity and SMP then it would be a pretty onesided > affair, with you being beaten up seriously ;-) > This is part of Xen's ABI, so it isn't easily changed. You're right that getting monotonicity is a pretty ugly affair of doing something like "if (now < previous_return) return ++previous_return;", but its better than using the tsc. (Last time around, it was "Xen can't use the tsc because it can't ever be used as a stable timebase" - though I don't think that was you specifically.) J ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating 2007-10-29 23:21 ` Ingo Molnar [not found] ` <20071029232140.GA10518-X9Un+BFzKDI@public.gmane.org> @ 2007-10-30 0:45 ` Ian Pratt [not found] ` <8A87A9A84C201449A0C56B728ACF491E34C5F5-sZfJFUT1i4lUWNNKjvdVWtlxNDNUtst69dF7HbQ/qKg@public.gmane.org> 1 sibling, 1 reply; 29+ messages in thread From: Ian Pratt @ 2007-10-30 0:45 UTC (permalink / raw) To: Ingo Molnar, Jeremy Fitzhardinge Cc: Zachary Amsden, ian.pratt-kDbDZe0LBGWFxr2TtlUqVg, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Avi Kivity, Glauber de Oliveira Costa, Glauber de Oliveira Costa, Garrett Smith > > Sigh, I don't really want to have this fight again. > > i dont remember us having discussed this before, ever. If there's any > "fight" about monotonicity and SMP then it would be a pretty onesided > affair, with you being beaten up seriously ;-) Actually, it is possible, even for NUMA systems with CPUs running off completely different oscillators, and in the presence of CPU frequency changes, power management, and even in the presence of thermal throttling (though the latter introduces temporary inaccuracies it doesn't affect monotonicity or rate). Take a look at the Xen code to see how each physical CPU is independently calibrated on an ongoing basis, how movement of VCPUs between physical CPUs is tracked, and how shared variables are used to ensure montonicity if a guest requires it. The fixed-rate TSCs on newer CPUs make some of this stuff easier, but you still need to cope with different source oscillators and some power management states. Ian > > I don't really see what point there is in raising the tsc's rating > > (why is 300 insufficient again?), but regardless of the value, the Xen > > clocksource rating needs to be higher. > > anyway, i agree that this patch cannot go in in its current form. > > Ingo > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" > in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <8A87A9A84C201449A0C56B728ACF491E34C5F5-sZfJFUT1i4lUWNNKjvdVWtlxNDNUtst69dF7HbQ/qKg@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <8A87A9A84C201449A0C56B728ACF491E34C5F5-sZfJFUT1i4lUWNNKjvdVWtlxNDNUtst69dF7HbQ/qKg@public.gmane.org> @ 2007-10-30 7:19 ` Ingo Molnar 0 siblings, 0 replies; 29+ messages in thread From: Ingo Molnar @ 2007-10-30 7:19 UTC (permalink / raw) To: Ian Pratt Cc: Zachary Amsden, Jeremy Fitzhardinge, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Avi Kivity, Glauber de Oliveira Costa, Glauber de Oliveira Costa, Garrett Smith * Ian Pratt <Ian.Pratt-kDbDZe0LBGWFxr2TtlUqVg@public.gmane.org> wrote: > > > Sigh, I don't really want to have this fight again. > > > > i dont remember us having discussed this before, ever. If there's any > > "fight" about monotonicity and SMP then it would be a pretty onesided > > affair, with you being beaten up seriously ;-) > > Actually, it is possible, even for NUMA systems with CPUs running off > completely different oscillators, and in the presence of CPU frequency > changes, power management, and even in the presence of thermal > throttling (though the latter introduces temporary inaccuracies it > doesn't affect monotonicity or rate). > > Take a look at the Xen code to see how each physical CPU is > independently calibrated on an ongoing basis, how movement of VCPUs > between physical CPUs is tracked, and how shared variables are used to > ensure montonicity if a guest requires it. I think that's wishful thinking. Check out: http://people.redhat.com/mingo/time-warp-test/time-warp-test.c change TEST_TSC to 0, run it on an SMP guest (on a reasonably fast machine) and let me know whether you can make SMP guests not come up with monotonicity violations in the CLOCK tests. Ingo ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating [not found] ` <20071029224852.GA27547-X9Un+BFzKDI@public.gmane.org> 2007-10-29 22:52 ` Jeremy Fitzhardinge @ 2007-10-29 22:55 ` Zachary Amsden 2007-10-29 23:02 ` Ingo Molnar [not found] ` <1193698505.9793.90.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> 1 sibling, 2 replies; 29+ messages in thread From: Zachary Amsden @ 2007-10-29 22:55 UTC (permalink / raw) To: Ingo Molnar Cc: jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith On Mon, 2007-10-29 at 23:48 +0100, Ingo Molnar wrote: > * Zachary Amsden <zach-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: > if it's inaccurate why are you exposing it to the guest then? Native > only uses the TSC if it's safe and accurate to do so. Not every guest support paravirt, but for correctness, all guests require TSC, which must be exposed all the way up to userspace, no matter what the efficiency or accuracy may be. Zach ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating 2007-10-29 22:55 ` Zachary Amsden @ 2007-10-29 23:02 ` Ingo Molnar 2007-10-29 23:13 ` Zachary Amsden [not found] ` <20071029230213.GA1982-X9Un+BFzKDI@public.gmane.org> [not found] ` <1193698505.9793.90.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> 1 sibling, 2 replies; 29+ messages in thread From: Ingo Molnar @ 2007-10-29 23:02 UTC (permalink / raw) To: Zachary Amsden Cc: Glauber de Oliveira Costa, linux-kernel, tglx, rusty, jeremy, --cc, avi, kvm-devel, Glauber de Oliveira Costa, Dan Hecht, Garrett Smith * Zachary Amsden <zach@vmware.com> wrote: > On Mon, 2007-10-29 at 23:48 +0100, Ingo Molnar wrote: > > * Zachary Amsden <zach@vmware.com> wrote: > > > if it's inaccurate why are you exposing it to the guest then? Native > > only uses the TSC if it's safe and accurate to do so. > > Not every guest support paravirt, but for correctness, all guests > require TSC, which must be exposed all the way up to userspace, no > matter what the efficiency or accuracy may be. but if there's a perfect TSC available (there is such hardware) then the TSC _is_ the best clocksource. Paravirt now turns it off unconditionally in essence. anyway, that's at most an optimization issue. No strong feelings here, and we can certainly delay this patch until this gets all sorted out. Ingo ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating 2007-10-29 23:02 ` Ingo Molnar @ 2007-10-29 23:13 ` Zachary Amsden [not found] ` <1193699638.9793.97.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> [not found] ` <20071029230213.GA1982-X9Un+BFzKDI@public.gmane.org> 1 sibling, 1 reply; 29+ messages in thread From: Zachary Amsden @ 2007-10-29 23:13 UTC (permalink / raw) To: Ingo Molnar Cc: Glauber de Oliveira Costa, linux-kernel, tglx, rusty, jeremy, --cc, avi, kvm-devel, Glauber de Oliveira Costa, Dan Hecht, Garrett Smith On Tue, 2007-10-30 at 00:02 +0100, Ingo Molnar wrote: > * Zachary Amsden <zach@vmware.com> wrote: > > Not every guest support paravirt, but for correctness, all guests > > require TSC, which must be exposed all the way up to userspace, no > > matter what the efficiency or accuracy may be. > > but if there's a perfect TSC available (there is such hardware) then the > TSC _is_ the best clocksource. Paravirt now turns it off unconditionally > in essence. No, if no paravirt clocksource is detected, nothing can override the perfect TSC hardware clocksource rating of 400. And if a paravirt clocksource is detected, it is always better than TSC. > anyway, that's at most an optimization issue. No strong feelings here, > and we can certainly delay this patch until this gets all sorted out. This patch should be nacked, since it is just wrong. This is not an optimization issue. It is an accuracy issue for all virtualization environments that affects long term kernel clock stability, which is important to fix, and the best way to do that is to use a paravirt clocksource. Zach ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <1193699638.9793.97.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <1193699638.9793.97.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> @ 2007-10-29 23:17 ` Ingo Molnar 2007-10-30 12:02 ` Glauber de Oliveira Costa 1 sibling, 0 replies; 29+ messages in thread From: Ingo Molnar @ 2007-10-29 23:17 UTC (permalink / raw) To: Zachary Amsden Cc: jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith * Zachary Amsden <zach-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: > On Tue, 2007-10-30 at 00:02 +0100, Ingo Molnar wrote: > > * Zachary Amsden <zach-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: > > > > Not every guest support paravirt, but for correctness, all guests > > > require TSC, which must be exposed all the way up to userspace, no > > > matter what the efficiency or accuracy may be. > > > > but if there's a perfect TSC available (there is such hardware) then the > > TSC _is_ the best clocksource. Paravirt now turns it off unconditionally > > in essence. > > No, if no paravirt clocksource is detected, nothing can override the > perfect TSC hardware clocksource rating of 400. And if a paravirt > clocksource is detected, it is always better than TSC. > > > anyway, that's at most an optimization issue. No strong feelings here, > > and we can certainly delay this patch until this gets all sorted out. > > This patch should be nacked, since it is just wrong. This is not an > optimization issue. It is an accuracy issue for all virtualization > environments that affects long term kernel clock stability, which is > important to fix, and the best way to do that is to use a paravirt > clocksource. i know it's not an optimization issue. Your current pessimisation of even perfect TSCs _is_ an optimization issue. (and note that if the TSC is unstable the guest will/should notice it and will fall back to the hyper clocksource) Ingo ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating [not found] ` <1193699638.9793.97.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> 2007-10-29 23:17 ` Ingo Molnar @ 2007-10-30 12:02 ` Glauber de Oliveira Costa [not found] ` <47271D4A.8030707-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 1 sibling, 1 reply; 29+ messages in thread From: Glauber de Oliveira Costa @ 2007-10-30 12:02 UTC (permalink / raw) To: Zachary Amsden Cc: jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Zachary Amsden escreveu: > On Tue, 2007-10-30 at 00:02 +0100, Ingo Molnar wrote: >> * Zachary Amsden <zach-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: > >>> Not every guest support paravirt, but for correctness, all guests >>> require TSC, which must be exposed all the way up to userspace, no >>> matter what the efficiency or accuracy may be. >> but if there's a perfect TSC available (there is such hardware) then the >> TSC _is_ the best clocksource. Paravirt now turns it off unconditionally >> in essence. > > No, if no paravirt clocksource is detected, nothing can override the > perfect TSC hardware clocksource rating of 400. And if a paravirt > clocksource is detected, it is always better than TSC. Why always? tsc is the best possible alternative the _platform_ can provide. So the paravirt clocksource will be at best, as good as tsc. And if it is the case: why bother with communication mechanisms of all kinds, calculations, etc, if we can just read the tsc? Noting again: If the tsc does not arrive accurate to the guest, it will fail the tsc clocksource tests. So it will be rated zero anyway -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (GNU/Linux) Comment: Using GnuPG with Remi - http://enigmail.mozdev.org iD8DBQFHJx1KjYI8LaFUWXMRAv0hAJ4sj0Z1FraYrgbU5Mj0pWOJGk6jtwCfc5xL jpTC273X0oqPTCR7NcVHJOI= =WPzV -----END PGP SIGNATURE----- ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <47271D4A.8030707-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <47271D4A.8030707-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2007-10-30 17:58 ` Zachary Amsden 0 siblings, 0 replies; 29+ messages in thread From: Zachary Amsden @ 2007-10-30 17:58 UTC (permalink / raw) To: Glauber de Oliveira Costa Cc: jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith On Tue, 2007-10-30 at 10:02 -0200, Glauber de Oliveira Costa wrote: > > No, if no paravirt clocksource is detected, nothing can override the > > perfect TSC hardware clocksource rating of 400. And if a paravirt > > clocksource is detected, it is always better than TSC. > > Why always? tsc is the best possible alternative the _platform_ can > provide. So the paravirt clocksource will be at best, as good as tsc. What is the justification for this claim? The TSC provides no information about frequency. The frequency must be measured from a known fixed rate source. This measurement is error prone, both on real hardware, and even more so in a VM. The TSC can only provide a single measure of elapsed periods. It has no concept to differentiate between elapsed periods of time spent in the guest, which is relevant to the scheduler, and elapsed periods of time spent in real time, which is relevant to timekeeping. So a TSC can not drive both timekeeping and scheduling in a virtual machine, or if it does, it does so inaccurately. > And if it is the case: why bother with communication mechanisms of all > kinds, calculations, etc, if we can just read the tsc? > > Noting again: If the tsc does not arrive accurate to the guest, it will > fail the tsc clocksource tests. So it will be rated zero anyway You always need to provide a TSC, and it will always initially appear to be accurate. Long term, as time is reported via the TSC is an approximation somewhere between real time and elapsed running time, measurements made using it in a VM will drift, and it is impossible to remove the inaccuracy from one end of the spectrum (by providing real time exactly) without compromising it at the other end (by skewing relative time measurement). A paravirt clocksource is always better than a TSC because it compensates for these effects. Zach ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <20071029230213.GA1982-X9Un+BFzKDI@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <20071029230213.GA1982-X9Un+BFzKDI@public.gmane.org> @ 2007-10-29 23:24 ` Dan Hecht [not found] ` <47266B90.8000008-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 29+ messages in thread From: Dan Hecht @ 2007-10-29 23:24 UTC (permalink / raw) To: Ingo Molnar Cc: Zachary Amsden, jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith On 10/29/2007 04:02 PM, Ingo Molnar wrote: > * Zachary Amsden <zach-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: > >> On Mon, 2007-10-29 at 23:48 +0100, Ingo Molnar wrote: >>> * Zachary Amsden <zach-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: >>> if it's inaccurate why are you exposing it to the guest then? Native >>> only uses the TSC if it's safe and accurate to do so. >> Not every guest support paravirt, but for correctness, all guests >> require TSC, which must be exposed all the way up to userspace, no >> matter what the efficiency or accuracy may be. > > but if there's a perfect TSC available (there is such hardware) then the > TSC _is_ the best clocksource. Paravirt now turns it off unconditionally > in essence. > Not really. In the case hardware TSC is perfect, the paravirt time counter can be implemented directly in terms of hardware TSC; there is no loss in optimization. This is done transparently. And virtual TSC can be implemented this way too. The real improvement that a paravirt clocksource offers over the TSC clocksource is that the guest does not need to measure the TSC frequency itself against some other constant frequency source (which is problematic on a virtual machine). Instead, the paravirt clocksource queries the hypervisor for the frequency of the counter. As you know, with clocksource style kernels, it's important to get this frequency correct, or else the guest will have long-term time drift. ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <47266B90.8000008-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <47266B90.8000008-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> @ 2007-10-30 4:24 ` Avi Kivity 2007-10-30 7:14 ` Ingo Molnar 1 sibling, 0 replies; 29+ messages in thread From: Avi Kivity @ 2007-10-30 4:24 UTC (permalink / raw) To: Dan Hecht Cc: Zachary Amsden, jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Glauber de Oliveira Costa, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith Dan Hecht wrote: > Not really. In the case hardware TSC is perfect, the paravirt time > counter can be implemented directly in terms of hardware TSC; there is > no loss in optimization. This is done transparently. And virtual TSC > can be implemented this way too. > > The real improvement that a paravirt clocksource offers over the TSC > clocksource is that the guest does not need to measure the TSC frequency > itself against some other constant frequency source (which is > problematic on a virtual machine). Instead, the paravirt clocksource > queries the hypervisor for the frequency of the counter. As you know, > with clocksource style kernels, it's important to get this frequency > correct, or else the guest will have long-term time drift. > > In addition, a paravirt clocksource can compensate for events like vcpu migration to another host cpu. So I agree: a paravirt clocksource is always better than or equal to the tsc. -- Any sufficiently difficult bug is indistinguishable from a feature. ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating [not found] ` <47266B90.8000008-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> 2007-10-30 4:24 ` Avi Kivity @ 2007-10-30 7:14 ` Ingo Molnar 1 sibling, 0 replies; 29+ messages in thread From: Ingo Molnar @ 2007-10-30 7:14 UTC (permalink / raw) To: Dan Hecht Cc: Zachary Amsden, jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith * Dan Hecht <dhecht-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: >> but if there's a perfect TSC available (there is such hardware) then >> the TSC _is_ the best clocksource. Paravirt now turns it off >> unconditionally in essence. > > Not really. In the case hardware TSC is perfect, the paravirt time > counter can be implemented directly in terms of hardware TSC; there is > no loss in optimization. This is done transparently. And virtual TSC > can be implemented this way too. Of course if you duplicate all (or part) of the TSC clocksource driver in the paravirt guest code then the "paravirt clocksource" is at least as good as the TSC. But that argument is playing word-games, _of course_ if you use the same (or similar) code it's at least as good. The real question are clocksources that communicate out to the hypervisor, and hence have higher overhead than a native, TSC based clocksource - and clocksources that use the TSC in a broken way. > The real improvement that a paravirt clocksource offers over the TSC > clocksource is that the guest does not need to measure the TSC > frequency itself against some other constant frequency source (which > is problematic on a virtual machine). [...] hey, you need not tell me, i've implemented a hyper-clocksource driver myself. But calibration is a boot only issue and there's no reason why calibration _has_ to be fragile. For example we could easily extend the TSC clocksource driver to not calibrate in the guest but take calibration information from the host. It's in essence a trivial and obvious extension to calibration. That way we get the highest possible performance _and_ we share much of the clocksource driver with the host. also, the way the TSC is used by guests like Xen is fundamentally fragile on SMP. So i have a good reason to distrust the approach of hypervisors to timekeeping. The maintenance problem to me is that everyone in the paravirt space is busy coding away in their own (often broken) direction, replicating the essence of the TSC clocksource driver 4 times over again and again, with subtle bugs in each variant, even in cases where the TSC readout can be trusted perfectly well. "Consolidation" and "sharing code" is not a particularly strong point of the paravirt projects ;-) (ok, KVM is a notable exception there.) anyway, i do agree that this patch is wrong currently, mainly due to TSC calibration not being reliable in guest-space at the moment - but the whole concept of putting a separate clocksource driver into each paravirt guest, even in the case where the TSC is perfect, is madness. That code, once the hardware gets sane (and there are good signs for that), and once calibration can be passed from host to guest reliably, _will_ be consolidated, because it makes perfect technical sense. Ingo ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
[parent not found: <1193698505.9793.90.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org>]
* Re: [PATCH] raise tsc clocksource rating [not found] ` <1193698505.9793.90.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org> @ 2007-10-30 11:59 ` Glauber de Oliveira Costa 0 siblings, 0 replies; 29+ messages in thread From: Glauber de Oliveira Costa @ 2007-10-30 11:59 UTC (permalink / raw) To: Zachary Amsden Cc: jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Dan Hecht, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA, Garrett Smith -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Zachary Amsden escreveu: > On Mon, 2007-10-29 at 23:48 +0100, Ingo Molnar wrote: >> * Zachary Amsden <zach-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org> wrote: > >> if it's inaccurate why are you exposing it to the guest then? Native >> only uses the TSC if it's safe and accurate to do so. > > Not every guest support paravirt, but for correctness, all guests > require TSC, which must be exposed all the way up to userspace, no > matter what the efficiency or accuracy may be. > > Zach > However, with such accuracy, it will fail the tsc clocksources tests. If it passes, it'a good clocksource to use. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (GNU/Linux) Comment: Using GnuPG with Remi - http://enigmail.mozdev.org iD8DBQFHJxyujYI8LaFUWXMRAhI5AKCDl/O7iA3TdtpvElVg8NoSDVfKpgCeNvz3 ZEZDMxg8T7FA2R+5cgH/uKI= =8qSH -----END PGP SIGNATURE----- ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] raise tsc clocksource rating [not found] ` <11936994092607-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2007-10-29 22:17 ` Thomas Gleixner 2007-10-29 22:42 ` Zachary Amsden @ 2007-10-30 0:17 ` H. Peter Anvin 2 siblings, 0 replies; 29+ messages in thread From: H. Peter Anvin @ 2007-10-30 0:17 UTC (permalink / raw) To: Glauber de Oliveira Costa Cc: jeremy-TSDbQ3PG+2Y, avi-i/T1Zxg06OZWk0Htik3J/w, kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Glauber de Oliveira Costa, --cc-H+wXaHxf7aLQT0dZR+AlfA Glauber de Oliveira Costa wrote: > From: Glauber de Oliveira Costa <glauber-aMxvdZfKFXomYvmMESoHnA@public.gmane.org> > > tsc is very good time source (when it does not have drifts, does not > change it's frequency, i.e. when it works), so it should have its rating > raised to a value greater than, or equal 400. > > Since it's being a tendency among paravirt clocksources to use values > around 400, we should declare tsc as even better: So we use 500. > > This patch also touches the comments on clocksource.h, which suggests > that 499 would be a limit on the rating values. > > Signed-off-by: Glauber de Oliveira Costa <gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> Are you sure these paravirt sources don't intentionally trump the TSC? -hpa ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2007-10-30 17:58 UTC | newest]
Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-10-29 23:10 [PATCH] raise tsc clocksource rating Glauber de Oliveira Costa
[not found] ` <11936994092607-git-send-email-gcosta-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2007-10-29 22:17 ` Thomas Gleixner
2007-10-29 22:36 ` Ingo Molnar
2007-10-30 1:26 ` john stultz
[not found] ` <alpine.LFD.0.9999.0710292315280.3186-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2007-10-30 2:39 ` Rusty Russell
[not found] ` <200710301339.23628.rusty-8n+1lVoiYb80n/F98K4Iww@public.gmane.org>
2007-10-30 7:37 ` Ingo Molnar
[not found] ` <20071030073736.GA21843-X9Un+BFzKDI@public.gmane.org>
2007-10-30 10:52 ` Rusty Russell
2007-10-30 12:13 ` Glauber de Oliveira Costa
2007-10-29 22:42 ` Zachary Amsden
[not found] ` <1193697734.9793.86.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org>
2007-10-29 22:45 ` Jeremy Fitzhardinge
2007-10-29 22:48 ` Ingo Molnar
[not found] ` <20071029224852.GA27547-X9Un+BFzKDI@public.gmane.org>
2007-10-29 22:52 ` Jeremy Fitzhardinge
2007-10-29 22:55 ` Ingo Molnar
[not found] ` <20071029225528.GA31293-X9Un+BFzKDI@public.gmane.org>
2007-10-29 23:17 ` Jeremy Fitzhardinge
[not found] ` <47266A00.4040408-TSDbQ3PG+2Y@public.gmane.org>
2007-10-29 23:21 ` Ingo Molnar
[not found] ` <20071029232140.GA10518-X9Un+BFzKDI@public.gmane.org>
2007-10-29 23:33 ` Jeremy Fitzhardinge
2007-10-30 0:45 ` Ian Pratt
[not found] ` <8A87A9A84C201449A0C56B728ACF491E34C5F5-sZfJFUT1i4lUWNNKjvdVWtlxNDNUtst69dF7HbQ/qKg@public.gmane.org>
2007-10-30 7:19 ` Ingo Molnar
2007-10-29 22:55 ` Zachary Amsden
2007-10-29 23:02 ` Ingo Molnar
2007-10-29 23:13 ` Zachary Amsden
[not found] ` <1193699638.9793.97.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org>
2007-10-29 23:17 ` Ingo Molnar
2007-10-30 12:02 ` Glauber de Oliveira Costa
[not found] ` <47271D4A.8030707-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2007-10-30 17:58 ` Zachary Amsden
[not found] ` <20071029230213.GA1982-X9Un+BFzKDI@public.gmane.org>
2007-10-29 23:24 ` Dan Hecht
[not found] ` <47266B90.8000008-pghWNbHTmq7QT0dZR+AlfA@public.gmane.org>
2007-10-30 4:24 ` Avi Kivity
2007-10-30 7:14 ` Ingo Molnar
[not found] ` <1193698505.9793.90.camel-cxY/u30q8FloTgUnLF1by8fTvwmfpRNyZeezCHUQhQ4@public.gmane.org>
2007-10-30 11:59 ` Glauber de Oliveira Costa
2007-10-30 0:17 ` H. Peter Anvin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox