* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
@ 2004-06-02 17:33 ` David Mosberger
2004-06-02 20:58 ` John Hesterberg
` (23 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-02 17:33 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 20 May 2004 12:05:57 -0700, Christoph Lameter <clameter@sgi.com> said:
Christoph> The Altix systems never use ITC to determine the time so
Christoph> this check and the cmpxchg is not necessary.
I see the problem: the basic problem seems to come from the fact that
the non-ITC-interpolation code didn't quite come out the way I
anticipated: I expected even this code to use/need last_nsec_offset
just like the ITC-based code, but it ended up not doing that. I
_think_ that's OK and, if so, we should probably push all the
last_nsec_offset back down into the interpolator-specific code.
However, this is tricky code so I want to mull it some more to make
sure I'm not missing a subtle race.
Christoph> Also the fast system call handler for sys_gettimeofday()
Christoph> can only improve scalability if the ITC is the basis for
Christoph> gettimeofday which is also never the case on Altix SN2
Christoph> systems.
Does this really make a substantial difference? My guess is no and,
if so, I'd rather not add any more #ifdefs.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
2004-06-02 17:33 ` David Mosberger
@ 2004-06-02 20:58 ` John Hesterberg
2004-06-03 2:38 ` Jack Steiner
` (22 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: John Hesterberg @ 2004-06-02 20:58 UTC (permalink / raw)
To: linux-ia64
On Wed, Jun 02, 2004 at 10:33:39AM -0700, David Mosberger wrote:
> Christoph> Also the fast system call handler for sys_gettimeofday()
> Christoph> can only improve scalability if the ITC is the basis for
> Christoph> gettimeofday which is also never the case on Altix SN2
> Christoph> systems.
>
> Does this really make a substantial difference? My guess is no and,
> if so, I'd rather not add any more #ifdefs.
It also only helps SN2 kernels and not generic kernels.
I don't think this is they key part of the patch,
so unless we see some compelling performance evidence,
I'd agree it seems better to leave out the #ifdef.
John
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
2004-06-02 17:33 ` David Mosberger
2004-06-02 20:58 ` John Hesterberg
@ 2004-06-03 2:38 ` Jack Steiner
2004-06-03 4:56 ` Christoph Lameter
` (21 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Jack Steiner @ 2004-06-03 2:38 UTC (permalink / raw)
To: linux-ia64
On Wed, Jun 02, 2004 at 03:58:26PM -0500, John Hesterberg wrote:
> On Wed, Jun 02, 2004 at 10:33:39AM -0700, David Mosberger wrote:
> > Christoph> Also the fast system call handler for sys_gettimeofday()
> > Christoph> can only improve scalability if the ITC is the basis for
> > Christoph> gettimeofday which is also never the case on Altix SN2
> > Christoph> systems.
> >
> > Does this really make a substantial difference? My guess is no and,
> > if so, I'd rather not add any more #ifdefs.
>
> It also only helps SN2 kernels and not generic kernels.
> I don't think this is they key part of the patch,
> so unless we see some compelling performance evidence,
> I'd agree it seems better to leave out the #ifdef.
>
> John
What is the limitation in gettimeofday that prevents us from using a
fast system call handler to access the RTC on SN2. I realize that fsys.S
does not currently support this, but is there any fundamental
restriction that makes this impossible to add?
If not, it would seem preferable to support SN2 rather add the #ifdefs.
--
Thanks
Jack Steiner (steiner@sgi.com) 651-683-5302
Principal Engineer SGI - Silicon Graphics, Inc.
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (2 preceding siblings ...)
2004-06-03 2:38 ` Jack Steiner
@ 2004-06-03 4:56 ` Christoph Lameter
2004-06-03 16:28 ` Christoph Lameter
` (20 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-03 4:56 UTC (permalink / raw)
To: linux-ia64
On Wed, 2 Jun 2004, Jack Steiner wrote:
> What is the limitation in gettimeofday that prevents us from using a
> fast system call handler to access the RTC on SN2. I realize that fsys.S
> does not currently support this, but is there any fundamental
> restriction that makes this impossible to add?
None. Except that I currently have other priorities. But I plan to get
around it once the bigger timer issues are dealt with.
> If not, it would seem preferable to support SN2 rather add the #ifdefs.
Right.
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (3 preceding siblings ...)
2004-06-03 4:56 ` Christoph Lameter
@ 2004-06-03 16:28 ` Christoph Lameter
2004-06-03 20:04 ` Chris Wedgwood
` (19 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-03 16:28 UTC (permalink / raw)
To: linux-ia64
On Wed, 2 Jun 2004, David Mosberger wrote:
> >>>>> On Thu, 20 May 2004 12:05:57 -0700, Christoph Lameter <clameter@sgi.com> said:
>
> Christoph> The Altix systems never use ITC to determine the time so
> Christoph> this check and the cmpxchg is not necessary.
>
> I see the problem: the basic problem seems to come from the fact that
> the non-ITC-interpolation code didn't quite come out the way I
> anticipated: I expected even this code to use/need last_nsec_offset
> just like the ITC-based code, but it ended up not doing that. I
> _think_ that's OK and, if so, we should probably push all the
> last_nsec_offset back down into the interpolator-specific code.
> However, this is tricky code so I want to mull it some more to make
> sure I'm not missing a subtle race.
I also think that pushing that into the interpolators is probably the
right way. I could work on a patch to do that if you do not have time
for it.
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (4 preceding siblings ...)
2004-06-03 16:28 ` Christoph Lameter
@ 2004-06-03 20:04 ` Chris Wedgwood
2004-06-04 6:29 ` David Mosberger
` (18 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Chris Wedgwood @ 2004-06-03 20:04 UTC (permalink / raw)
To: linux-ia64
On Wed, Jun 02, 2004 at 10:33:39AM -0700, David Mosberger wrote:
> Does this really make a substantial difference? My guess is no and,
> if so, I'd rather not add any more #ifdefs.
How about have the sn2 init code stomp over this value and point to
some local code as required?
/me hides :)
--cw
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (5 preceding siblings ...)
2004-06-03 20:04 ` Chris Wedgwood
@ 2004-06-04 6:29 ` David Mosberger
2004-06-04 14:26 ` Christoph Lameter
` (17 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-04 6:29 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 3 Jun 2004 13:04:12 -0700, Chris Wedgwood <cw@f00f.org> said:
Chris> On Wed, Jun 02, 2004 at 10:33:39AM -0700, David Mosberger wrote:
>> Does this really make a substantial difference? My guess is no and,
>> if so, I'd rather not add any more #ifdefs.
Chris> How about have the sn2 init code stomp over this value and point to
Chris> some local code as required?
Do you know about the "nolwsys" option? It already patches the
fsyscall-table at boot time. Now this isn't something that we should
go overboard with, but in a few select cases, it may indeed be
reasonable (and the patch would have to be limited to boot-time, to
avoid races etc.).
It would, however, be good to share as much source code as possible
between the different implementations. For example, me thinks the
HPET-version of fsys_gettimeofday() would look almost identical to the
SN2-version and some of the existing fsys_gettimeofday() source code
should be sharable as well. So if someone wants to pursue this in
earnest, it might make sense to move fsys_gettimeofday() into a
separate file.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (6 preceding siblings ...)
2004-06-04 6:29 ` David Mosberger
@ 2004-06-04 14:26 ` Christoph Lameter
2004-06-05 5:12 ` David Mosberger
` (16 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-04 14:26 UTC (permalink / raw)
To: linux-ia64
On Thu, 3 Jun 2004, David Mosberger wrote:
> Do you know about the "nolwsys" option? It already patches the
> fsyscall-table at boot time. Now this isn't something that we should
> go overboard with, but in a few select cases, it may indeed be
> reasonable (and the patch would have to be limited to boot-time, to
> avoid races etc.).
>
> It would, however, be good to share as much source code as possible
> between the different implementations. For example, me thinks the
> HPET-version of fsys_gettimeofday() would look almost identical to the
> SN2-version and some of the existing fsys_gettimeofday() source code
> should be sharable as well. So if someone wants to pursue this in
> earnest, it might make sense to move fsys_gettimeofday() into a
> separate file.
Ahh. Interesting. This is likely in the scope of my work.
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (7 preceding siblings ...)
2004-06-04 14:26 ` Christoph Lameter
@ 2004-06-05 5:12 ` David Mosberger
2004-06-07 17:46 ` Christoph Lameter
` (15 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-05 5:12 UTC (permalink / raw)
To: linux-ia64
>>>>> On Thu, 20 May 2004 12:05:57 -0700, Christoph Lameter <clameter@sgi.com> said:
Christoph> The Altix systems use a global clock as the time source
Christoph> for gettimeofday and not the ITC. In do_gettimeofday() a
Christoph> cmpxchg instruction is used to insure that ITC based time
Christoph> calculations never return an earlier time. This cmpxchg
Christoph> seems to be the reason for the scalability issues.
How does the attached look/work for you?
--david
=== arch/ia64/kernel/time.c 1.42 vs edited ==--- 1.42/arch/ia64/kernel/time.c Fri Jun 4 15:44:03 2004
+++ edited/arch/ia64/kernel/time.c Fri Jun 4 21:40:44 2004
@@ -32,6 +32,7 @@
extern unsigned long wall_jiffies;
+volatile unsigned long last_nsec_offset;
u64 jiffies_64 = INITIAL_JIFFIES;
EXPORT_SYMBOL(jiffies_64);
@@ -48,6 +49,7 @@
static void
itc_reset (void)
{
+ last_nsec_offset = 0;
}
/*
@@ -57,6 +59,17 @@
static void
itc_update (long delta_nsec)
{
+ if (last_nsec_offset > 0) {
+ unsigned long new, old;
+
+ do {
+ old = last_nsec_offset;
+ if (old > delta_nsec)
+ new = old - delta_nsec;
+ else
+ new = 0;
+ } while (cmpxchg(&last_nsec_offset, old, new) != old);
+ }
}
/*
@@ -83,7 +96,8 @@
static struct time_interpolator itc_interpolator = {
.get_offset = itc_get_offset,
.update = itc_update,
- .reset = itc_reset
+ .reset = itc_reset,
+ .is_drifty = 1
};
int
@@ -139,6 +153,10 @@
}
if (unlikely(read_seqretry(&xtime_lock, seq)))
continue;
+
+ if (!time_interpolator->is_drifty)
+ break;
+
/*
* Ensure that for any pair of causally ordered gettimeofday() calls, time
* never goes backwards (even when ITC on different CPUs are not perfectly
=== include/linux/timex.h 1.12 vs edited ==--- 1.12/include/linux/timex.h Mon Oct 27 11:53:38 2003
+++ edited/include/linux/timex.h Fri Jun 4 21:40:31 2004
@@ -325,6 +325,8 @@
unsigned long (*get_offset) (void);
void (*update) (long);
void (*reset) (void);
+ /* set to 1 if interpolation-value may vary (slightly) from CPU to CPU: */
+ unsigned long is_drifty : 1;
/* cache-cold stuff follows here: */
struct time_interpolator *next;
@@ -332,10 +334,6 @@
long drift; /* drift in parts-per-million (or -1) */
};
-extern volatile unsigned long last_nsec_offset;
-#ifndef __HAVE_ARCH_CMPXCHG
-extern spin_lock_t last_nsec_offset_lock;
-#endif
extern struct time_interpolator *time_interpolator;
extern void register_time_interpolator(struct time_interpolator *);
@@ -347,31 +345,6 @@
{
struct time_interpolator *ti = time_interpolator;
- if (last_nsec_offset > 0) {
-#ifdef __HAVE_ARCH_CMPXCHG
- unsigned long new, old;
-
- do {
- old = last_nsec_offset;
- if (old > delta_nsec)
- new = old - delta_nsec;
- else
- new = 0;
- } while (cmpxchg(&last_nsec_offset, old, new) != old);
-#else
- /*
- * This really hurts, because it serializes gettimeofday(), but without an
- * atomic single-word compare-and-exchange, there isn't all that much else
- * we can do.
- */
- spin_lock(&last_nsec_offset_lock);
- {
- last_nsec_offset -= min(last_nsec_offset, delta_nsec);
- }
- spin_unlock(&last_nsec_offset_lock);
-#endif
- }
-
if (ti)
(*ti->update)(delta_nsec);
}
@@ -382,7 +355,6 @@
{
struct time_interpolator *ti = time_interpolator;
- last_nsec_offset = 0;
if (ti)
(*ti->reset)();
}
@@ -394,7 +366,7 @@
struct time_interpolator *ti = time_interpolator;
if (ti)
return (*ti->get_offset)();
- return last_nsec_offset;
+ return 0;
}
#else /* !CONFIG_TIME_INTERPOLATION */
=== kernel/timer.c 1.80 vs edited ==--- 1.80/kernel/timer.c Thu May 27 16:30:37 2004
+++ edited/kernel/timer.c Fri Jun 4 21:56:21 2004
@@ -1426,10 +1426,6 @@
}
#ifdef CONFIG_TIME_INTERPOLATION
-volatile unsigned long last_nsec_offset;
-#ifndef __HAVE_ARCH_CMPXCHG
-spinlock_t last_nsec_offset_lock = SPIN_LOCK_UNLOCKED;
-#endif
struct time_interpolator *time_interpolator;
static struct time_interpolator *time_interpolator_list;
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (8 preceding siblings ...)
2004-06-05 5:12 ` David Mosberger
@ 2004-06-07 17:46 ` Christoph Lameter
2004-06-07 18:14 ` David Mosberger
` (14 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-07 17:46 UTC (permalink / raw)
To: linux-ia64
Would it be possible to avoid any changes to arch independent code by
changing get_offset fir the ITC clock to never return an earlier offset in
the same tick?
Then the ITC specific code in the gettimeofday could be removed.
There is a use of volatile that be better avoided (I think the general
recommendation on l-k is to use barriers). The fieldname "is_drifty" can
lead to misunderstandings since drift is typically used to refer to
continual drift due to inaccuracies of the clocks and not because a
variety of clocks are in use.
On Fri, 4 Jun 2004, David Mosberger wrote:
> >>>>> On Thu, 20 May 2004 12:05:57 -0700, Christoph Lameter <clameter@sgi.com> said:
>
> Christoph> The Altix systems use a global clock as the time source
> Christoph> for gettimeofday and not the ITC. In do_gettimeofday() a
> Christoph> cmpxchg instruction is used to insure that ITC based time
> Christoph> calculations never return an earlier time. This cmpxchg
> Christoph> seems to be the reason for the scalability issues.
>
> How does the attached look/work for you?
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (9 preceding siblings ...)
2004-06-07 17:46 ` Christoph Lameter
@ 2004-06-07 18:14 ` David Mosberger
2004-06-07 20:24 ` Christoph Lameter
` (13 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-07 18:14 UTC (permalink / raw)
To: linux-ia64
>>>>> On Mon, 7 Jun 2004 10:46:41 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
Christoph> Would it be possible to avoid any changes to arch
Christoph> independent code by changing get_offset fir the ITC clock
Christoph> to never return an earlier offset in the same tick?
I didn't think so. The problem is that for ITC-interpolation, we
basically need a two-phase protocol: get an offset and if everything
is consistent, commit to the new offset. You're welcome to try,
though.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (10 preceding siblings ...)
2004-06-07 18:14 ` David Mosberger
@ 2004-06-07 20:24 ` Christoph Lameter
2004-06-07 20:55 ` David Mosberger
` (12 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-07 20:24 UTC (permalink / raw)
To: linux-ia64
On Mon, 7 Jun 2004, David Mosberger wrote:
> >>>>> On Mon, 7 Jun 2004 10:46:41 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
>
> Christoph> Would it be possible to avoid any changes to arch
> Christoph> independent code by changing get_offset fir the ITC clock
> Christoph> to never return an earlier offset in the same tick?
>
> I didn't think so. The problem is that for ITC-interpolation, we
> basically need a two-phase protocol: get an offset and if everything
> is consistent, commit to the new offset. You're welcome to try,
> though.
How would you like the following patch? It simply uses a static variable
to insure that the ITC value used in get_offset never goes backward.
gettimeofday then becomes much simpler.
This still needs the following improvements that I probably should not
waste time with if this is not acceptable:
- last_itc should really be updated with cmpxchg
- the gettimeofday fast path assembly code would have to be updated.
Index: linux-2.6.6/arch/ia64/kernel/time.c
=================================--- linux-2.6.6.orig/arch/ia64/kernel/time.c 2004-05-09 19:32:24.000000000 -0700
+++ linux-2.6.6/arch/ia64/kernel/time.c 2004-06-07 13:11:05.817463999 -0700
@@ -77,9 +77,17 @@
unsigned long
itc_get_offset (void)
{
+ static unsigned long last_itc=0;
unsigned long elapsed_cycles, lost = jiffies - wall_jiffies;
unsigned long now = ia64_get_itc(), last_tick;
+
+ /* Make sure time does not run backwards but allow warparound */
+ if (last_itc && now<last_itc && last_itc-now<0x8000000000000000UL)
+ now=last_itc;
+ else
+ last_itc=now;
+
last_tick = (cpu_data(TIME_KEEPER_ID)->itm_next
- (lost + 1)*cpu_data(TIME_KEEPER_ID)->itm_delta);
@@ -134,51 +142,14 @@
void
do_gettimeofday (struct timeval *tv)
{
- unsigned long seq, nsec, usec, sec, old, offset;
+ unsigned long seq, nsec, usec, sec, offset;
- while (1) {
+ do {
seq = read_seqbegin(&xtime_lock);
- {
- old = last_nsec_offset;
- offset = time_interpolator_get_offset();
- sec = xtime.tv_sec;
- nsec = xtime.tv_nsec;
- }
- if (unlikely(read_seqretry(&xtime_lock, seq)))
- continue;
- /*
- * Ensure that for any pair of causally ordered gettimeofday() calls, time
- * never goes backwards (even when ITC on different CPUs are not perfectly
- * synchronized). (A pair of concurrent calls to gettimeofday() is by
- * definition non-causal and hence it makes no sense to talk about
- * time-continuity for such calls.)
- *
- * Doing this in a lock-free and race-free manner is tricky. Here is why
- * it works (most of the time): read_seqretry() just succeeded, which
- * implies we calculated a consistent (valid) value for "offset". If the
- * cmpxchg() below succeeds, we further know that last_nsec_offset still
- * has the same value as at the beginning of the loop, so there was
- * presumably no timer-tick or other updates to last_nsec_offset in the
- * meantime. This isn't 100% true though: there _is_ a possibility of a
- * timer-tick occurring right right after read_seqretry() and then getting
- * zero or more other readers which will set last_nsec_offset to the same
- * value as the one we read at the beginning of the loop. If this
- * happens, we'll end up returning a slightly newer time than we ought to
- * (the jump forward is at most "offset" nano-seconds). There is no
- * danger of causing time to go backwards, though, so we are safe in that
- * sense. We could make the probability of this unlucky case occurring
- * arbitrarily small by encoding a version number in last_nsec_offset, but
- * even without versioning, the probability of this unlucky case should be
- * so small that we won't worry about it.
- */
- if (offset <= old) {
- offset = old;
- break;
- } else if (likely(cmpxchg(&last_nsec_offset, old, offset) = old))
- break;
-
- /* someone else beat us to updating last_nsec_offset; try again */
- }
+ offset = time_interpolator_get_offset();
+ sec = xtime.tv_sec;
+ nsec = xtime.tv_nsec;
+ } while (unlikely(read_seqretry(&xtime_lock, seq)));
usec = (nsec + offset) / 1000;
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (11 preceding siblings ...)
2004-06-07 20:24 ` Christoph Lameter
@ 2004-06-07 20:55 ` David Mosberger
2004-06-07 22:48 ` Christoph Lameter
` (11 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-07 20:55 UTC (permalink / raw)
To: linux-ia64
>>>>> On Mon, 7 Jun 2004 13:24:45 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
Christoph> How would you like the following patch? It simply uses a
Christoph> static variable to insure that the ITC value used in
Christoph> get_offset never goes backward.
Eh, you need to worry about races. There can be a timer-tick while
get_offset is running.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (12 preceding siblings ...)
2004-06-07 20:55 ` David Mosberger
@ 2004-06-07 22:48 ` Christoph Lameter
2004-06-07 23:02 ` David Mosberger
` (10 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-07 22:48 UTC (permalink / raw)
To: linux-ia64
On Mon, 7 Jun 2004, David Mosberger wrote:
> >>>>> On Mon, 7 Jun 2004 13:24:45 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
>
> Christoph> How would you like the following patch? It simply uses a
> Christoph> static variable to insure that the ITC value used in
> Christoph> get_offset never goes backward.
>
> Eh, you need to worry about races. There can be a timer-tick while
> get_offset is running.
get_offset is invoked from gettimeofday with a seqlock. In that case it
would repeat its actions. I can put an explicit seqlock into get_offset?
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (13 preceding siblings ...)
2004-06-07 22:48 ` Christoph Lameter
@ 2004-06-07 23:02 ` David Mosberger
2004-06-08 4:08 ` Christoph Lameter
` (9 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-07 23:02 UTC (permalink / raw)
To: linux-ia64
>>>>> On Mon, 7 Jun 2004 15:48:57 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
>> Eh, you need to worry about races. There can be a timer-tick
>> while get_offset is running.
Christoph> get_offset is invoked from gettimeofday with a
Christoph> seqlock. In that case it would repeat its actions. I can
Christoph> put an explicit seqlock into get_offset?
I guess I wasn't clear enough: if you update "last_nsec_offset" in
get_offset(), then you might have committed a "bad" value into a
global variable. Bad. The seq-lock won't help you there, because the
global variable alreayd has been polluted with a bad value.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (14 preceding siblings ...)
2004-06-07 23:02 ` David Mosberger
@ 2004-06-08 4:08 ` Christoph Lameter
2004-06-08 4:33 ` David Mosberger
` (8 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-08 4:08 UTC (permalink / raw)
To: linux-ia64
On Mon, 7 Jun 2004, David Mosberger wrote:
> >>>>> On Mon, 7 Jun 2004 15:48:57 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
>
> >> Eh, you need to worry about races. There can be a timer-tick
> >> while get_offset is running.
>
> Christoph> get_offset is invoked from gettimeofday with a
> Christoph> seqlock. In that case it would repeat its actions. I can
> Christoph> put an explicit seqlock into get_offset?
>
> I guess I wasn't clear enough: if you update "last_nsec_offset" in
> get_offset(), then you might have committed a "bad" value into a
> global variable. Bad. The seq-lock won't help you there, because the
> global variable alreayd has been polluted with a bad value.
Here is the patch with the use of cmpxchg to update last_itc:
Index: linux-2.6.6/arch/ia64/kernel/time.c
=================================--- linux-2.6.6.orig/arch/ia64/kernel/time.c 2004-05-09 19:32:24.000000000 -0700
+++ linux-2.6.6/arch/ia64/kernel/time.c 2004-06-07 21:03:42.551491629 -0700
@@ -77,12 +77,30 @@
unsigned long
itc_get_offset (void)
{
- unsigned long elapsed_cycles, lost = jiffies - wall_jiffies;
- unsigned long now = ia64_get_itc(), last_tick;
+ static unsigned long last_itc=0;
+ unsigned long elapsed_cycles, lost;
+ unsigned long now, last_tick;
+ unsigned long litc;
- last_tick = (cpu_data(TIME_KEEPER_ID)->itm_next
+ /* Determine ITC value that is >= than the last one used */
+ do {
+ litc=last_itc;
+ now = ia64_get_itc();
+ lost = jiffies - wall_jiffies;
+ last_tick = (cpu_data(TIME_KEEPER_ID)->itm_next
- (lost + 1)*cpu_data(TIME_KEEPER_ID)->itm_delta);
-
+
+ /* Make sure time does not run backwards but allow warparound */
+ if (last_itc && now<last_itc && last_itc-now<0x8000000000000000UL)
+ {
+ /* Time would be running backwards if we would use this ITC value.
+ * Use the last ITC value instead.
+ */
+ now=litc;
+ break;
+ }
+ } while (unlikely(cmpxchg(&last_itc,litc,now)));
+
elapsed_cycles = now - last_tick;
return (elapsed_cycles*local_cpu_data->nsec_per_cyc) >> IA64_NSEC_PER_CYC_SHIFT;
}
@@ -134,51 +152,14 @@
void
do_gettimeofday (struct timeval *tv)
{
- unsigned long seq, nsec, usec, sec, old, offset;
+ unsigned long seq, nsec, usec, sec, offset;
- while (1) {
+ do {
seq = read_seqbegin(&xtime_lock);
- {
- old = last_nsec_offset;
- offset = time_interpolator_get_offset();
- sec = xtime.tv_sec;
- nsec = xtime.tv_nsec;
- }
- if (unlikely(read_seqretry(&xtime_lock, seq)))
- continue;
- /*
- * Ensure that for any pair of causally ordered gettimeofday() calls, time
- * never goes backwards (even when ITC on different CPUs are not perfectly
- * synchronized). (A pair of concurrent calls to gettimeofday() is by
- * definition non-causal and hence it makes no sense to talk about
- * time-continuity for such calls.)
- *
- * Doing this in a lock-free and race-free manner is tricky. Here is why
- * it works (most of the time): read_seqretry() just succeeded, which
- * implies we calculated a consistent (valid) value for "offset". If the
- * cmpxchg() below succeeds, we further know that last_nsec_offset still
- * has the same value as at the beginning of the loop, so there was
- * presumably no timer-tick or other updates to last_nsec_offset in the
- * meantime. This isn't 100% true though: there _is_ a possibility of a
- * timer-tick occurring right right after read_seqretry() and then getting
- * zero or more other readers which will set last_nsec_offset to the same
- * value as the one we read at the beginning of the loop. If this
- * happens, we'll end up returning a slightly newer time than we ought to
- * (the jump forward is at most "offset" nano-seconds). There is no
- * danger of causing time to go backwards, though, so we are safe in that
- * sense. We could make the probability of this unlucky case occurring
- * arbitrarily small by encoding a version number in last_nsec_offset, but
- * even without versioning, the probability of this unlucky case should be
- * so small that we won't worry about it.
- */
- if (offset <= old) {
- offset = old;
- break;
- } else if (likely(cmpxchg(&last_nsec_offset, old, offset) = old))
- break;
-
- /* someone else beat us to updating last_nsec_offset; try again */
- }
+ offset = time_interpolator_get_offset();
+ sec = xtime.tv_sec;
+ nsec = xtime.tv_nsec;
+ } while (unlikely(read_seqretry(&xtime_lock, seq)));
usec = (nsec + offset) / 1000;
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (15 preceding siblings ...)
2004-06-08 4:08 ` Christoph Lameter
@ 2004-06-08 4:33 ` David Mosberger
2004-06-08 5:59 ` Christoph Lameter
` (7 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-08 4:33 UTC (permalink / raw)
To: linux-ia64
>>>>> On Mon, 7 Jun 2004 21:08:10 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
Christoph> Here is the patch with the use of cmpxchg to update
Christoph> last_itc:
You're missing my point: if read_seqretry() causes a retry,
get_offset() better had NOT modified the global time-offset.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (16 preceding siblings ...)
2004-06-08 4:33 ` David Mosberger
@ 2004-06-08 5:59 ` Christoph Lameter
2004-06-08 6:06 ` David Mosberger
` (6 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-08 5:59 UTC (permalink / raw)
To: linux-ia64
On Mon, 7 Jun 2004, David Mosberger wrote:
> >>>>> On Mon, 7 Jun 2004 21:08:10 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
>
> Christoph> Here is the patch with the use of cmpxchg to update
> Christoph> last_itc:
>
> You're missing my point: if read_seqretry() causes a retry,
> get_offset() better had NOT modified the global time-offset.
It seems that you are not looking at my code and are assuming that the
code is still using last_nsec_offset.
In this implementation get_offset can update "last_itc" however many times
wanted without any problem. last_itc just insures that ITC becomes a
monotonic clock like the other clock sources. last_nsec_offset is not
used and could be removed.
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (17 preceding siblings ...)
2004-06-08 5:59 ` Christoph Lameter
@ 2004-06-08 6:06 ` David Mosberger
2004-06-08 7:11 ` Christoph Lameter
` (5 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-08 6:06 UTC (permalink / raw)
To: linux-ia64
>>>>> On Mon, 7 Jun 2004 22:59:09 -0700 (PDT), Christoph Lameter <clameter@sgi.com> said:
Christoph> It seems that you are not looking at my code and are
Christoph> assuming that the code is still using last_nsec_offset.
No, I'm not.
Christoph> In this implementation get_offset can update "last_itc"
Christoph> however many times wanted without any problem. last_itc
Christoph> just insures that ITC becomes a monotonic clock like the
Christoph> other clock sources. last_nsec_offset is not used and
Christoph> could be removed.
It doesn't matter what you're calling the variable. If you update a
global variable with a bogus value, you lose. Sure, it will be
monotonic, but if you get unlucky and the (bad) interpolated value
jumped forward by a huge amount, you'll get horrible resolution for a
long time to come.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (18 preceding siblings ...)
2004-06-08 6:06 ` David Mosberger
@ 2004-06-08 7:11 ` Christoph Lameter
2004-06-08 18:02 ` David Mosberger
` (4 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-08 7:11 UTC (permalink / raw)
To: linux-ia64
On Mon, 7 Jun 2004, David Mosberger wrote:
> Christoph> In this implementation get_offset can update "last_itc"
> Christoph> however many times wanted without any problem. last_itc
> Christoph> just insures that ITC becomes a monotonic clock like the
> Christoph> other clock sources. last_nsec_offset is not used and
> Christoph> could be removed.
>
> It doesn't matter what you're calling the variable. If you update a
> global variable with a bogus value, you lose. Sure, it will be
> monotonic, but if you get unlucky and the (bad) interpolated value
> jumped forward by a huge amount, you'll get horrible resolution for a
> long time to come.
How could this implementation put a bogus value into last_itc?
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (19 preceding siblings ...)
2004-06-08 7:11 ` Christoph Lameter
@ 2004-06-08 18:02 ` David Mosberger
2004-06-08 18:32 ` Christoph Lameter
` (3 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-08 18:02 UTC (permalink / raw)
To: linux-ia64
>>>>> On Tue, 8 Jun 2004 00:11:25 -0700 (PDT), Christoph Lameter <christoph@lameter.com> said:
Christoph> How could this implementation put a bogus value into
Christoph> last_itc?
The itc_get_offset() calculation depends on TIME_KEEPER->itm_next,
"jiffies" and "wall_jiffies", all of which are updated asynchronously
in response to a timer-interrupt. If a timer-interrupt hits while
itc_get_offset() is running, the resulting value may have used
inconsistent values for those variables and hence could be bogus.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (20 preceding siblings ...)
2004-06-08 18:02 ` David Mosberger
@ 2004-06-08 18:32 ` Christoph Lameter
2004-06-08 21:38 ` David Mosberger
` (2 subsequent siblings)
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-08 18:32 UTC (permalink / raw)
To: linux-ia64
On Tue, 8 Jun 2004, David Mosberger wrote:
> >>>>> On Tue, 8 Jun 2004 00:11:25 -0700 (PDT), Christoph Lameter <christoph@lameter.com> said:
>
> Christoph> How could this implementation put a bogus value into
> Christoph> last_itc?
>
> The itc_get_offset() calculation depends on TIME_KEEPER->itm_next,
> "jiffies" and "wall_jiffies", all of which are updated asynchronously
> in response to a timer-interrupt. If a timer-interrupt hits while
> itc_get_offset() is running, the resulting value may have used
> inconsistent values for those variables and hence could be bogus.
This does not effect the update of last_itc which does not depend on
any of the variables you mention. The dependency existed for
last_nsec_offset but does NOT exist for last_itc. There is much more than
just a name change going on.
The bogus return value from get_offset will be ignored by gettimeofday
because of the seqlock. That aspect of get_offset is the same as before.
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (21 preceding siblings ...)
2004-06-08 18:32 ` Christoph Lameter
@ 2004-06-08 21:38 ` David Mosberger
2004-06-08 21:48 ` Christoph Lameter
2004-06-08 22:13 ` David Mosberger
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-08 21:38 UTC (permalink / raw)
To: linux-ia64
>>>>> On Tue, 8 Jun 2004 11:32:46 -0700 (PDT), Christoph Lameter <christoph@lameter.com> said:
Christoph> This does not effect the update of last_itc which does
Christoph> not depend on any of the variables you mention. The
Christoph> dependency existed for last_nsec_offset but does NOT
Christoph> exist for last_itc. There is much more than just a name
Christoph> change going on.
Let me try this again: the second you update a global variable with a
bogus/inconsistent value, another CPU may observe that value. The
fact that you're going to try to correct that bad value later on
doesn't help you at all. See the problem now?
The fundamental issue is that with ITC-based interpolation, reading
the offset has a side-effect, which isn't true for global-timer-based
interpolation. Hence the need for a two-phase protocol.
--david
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (22 preceding siblings ...)
2004-06-08 21:38 ` David Mosberger
@ 2004-06-08 21:48 ` Christoph Lameter
2004-06-08 22:13 ` David Mosberger
24 siblings, 0 replies; 26+ messages in thread
From: Christoph Lameter @ 2004-06-08 21:48 UTC (permalink / raw)
To: linux-ia64
On Tue, 8 Jun 2004, David Mosberger wrote:
> >>>>> On Tue, 8 Jun 2004 11:32:46 -0700 (PDT), Christoph Lameter <christoph@lameter.com> said:
>
> Christoph> This does not effect the update of last_itc which does
> Christoph> not depend on any of the variables you mention. The
> Christoph> dependency existed for last_nsec_offset but does NOT
> Christoph> exist for last_itc. There is much more than just a name
> Christoph> change going on.
>
> Let me try this again: the second you update a global variable with a
> bogus/inconsistent value, another CPU may observe that value. The
> fact that you're going to try to correct that bad value later on
> doesn't help you at all. See the problem now?
Please show me how last_itc can *receive* a bogus value. What you are
saying cannot happen because last_itc does not depend in any way on
variables modified by the timer interrupt. Last_itc is updated using a
cmpxchg which will deal with the problem of multiple CPUs updating the
value.
A bogus value may be returned from get_offset if a timer interrupt
happens and is then properly disposed of later.
> The fundamental issue is that with ITC-based interpolation, reading
> the offset has a side-effect, which isn't true for global-timer-based
> interpolation. Hence the need for a two-phase protocol.
It does not need to have the side effect (like the use of
last_nsec_offset) if the ITC is monotonized and
therefore works like a global timer.
Here is the offset routine after my modifications. last_itc is in no way
depending on any of the timer variables but only on ITC. The timer
interrupt cannot cause an bogus value to be written to last_itc:
unsigned long
itc_get_offset (void)
{
static unsigned long last_itc=0;
unsigned long elapsed_cycles, lost;
unsigned long now, last_tick;
unsigned long litc;
/* Determine ITC value that is >= than the last one used */
do {
litc=last_itc;
now = ia64_get_itc();
lost = jiffies - wall_jiffies;
last_tick = (cpu_data(TIME_KEEPER_ID)->itm_next
- (lost + 1)*cpu_data(TIME_KEEPER_ID)->itm_delta);
/* Make sure time does not run backwards but allow warparound */
if (last_itc && now<last_itc && last_itc-now<0x8000000000000000UL)
{
/* Time would be running backwards if we would use this ITC value.
* Use the last ITC value instead.
*/
now=litc;
break;
}
} while (unlikely(cmpxchg(&last_itc,litc,now)));
elapsed_cycles = now - last_tick;
return (elapsed_cycles*local_cpu_data->nsec_per_cyc) >> IA64_NSEC_PER_CYC_SHIFT;
}
^ permalink raw reply [flat|nested] 26+ messages in thread* Re: Scalability enhancements for gettimeofday
2004-05-20 19:05 Scalability enhancements for gettimeofday Christoph Lameter
` (23 preceding siblings ...)
2004-06-08 21:48 ` Christoph Lameter
@ 2004-06-08 22:13 ` David Mosberger
24 siblings, 0 replies; 26+ messages in thread
From: David Mosberger @ 2004-06-08 22:13 UTC (permalink / raw)
To: linux-ia64
>>>>> On Tue, 8 Jun 2004 14:48:20 -0700 (PDT), Christoph Lameter <christoph@lameter.com> said:
Christoph> Please show me how last_itc can *receive* a bogus
Christoph> value. What you are saying cannot happen because last_itc
Christoph> does not depend in any way on variables modified by the
Christoph> timer interrupt. Last_itc is updated using a cmpxchg
Christoph> which will deal with the problem of multiple CPUs
Christoph> updating the value.
Ah, I see now what you mean. Yes, tracking the maximum ITC rather
than a relative offsets is a nice trick that solves the problem of the
ITC-based getoffset not being "restartable".
What threw me off is that you continue to read jiffies and
wall_jiffies in the do/while-loop for the cmpxchg. I don't see why
that's necessary/helpful. If you get a tick, you'll throw away the
result of the routine anyhow so you might just as well move it out of
the loop.
So I think your patch does work. I'd only ask that you replace the
explicit "check-for-earlier while allowing for wraparound" with a call
to the time_before() macro (or at least the equivalent open-code, if
you're nervous about using jiffies-related macros for non-jiffy
values).
Thanks,
--david
^ permalink raw reply [flat|nested] 26+ messages in thread