* clock synchronization utility code @ 2009-02-04 13:01 Patrick Ohly 2009-02-04 13:01 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Patrick Ohly 0 siblings, 1 reply; 23+ messages in thread From: Patrick Ohly @ 2009-02-04 13:01 UTC (permalink / raw) To: linux-kernel; +Cc: netdev, David Miller, John Stultz, Thomas Gleixner Hello! These two patches are part of the larger patch series which adds support for a hardware assisted implementation of the Precision Time Protocol (PTP, IEEE 1588). They apply to net-next-2.6 as of a few days ago, which itself was recently merged with v2.6.29-rc2. It seems that we have reached a consensus how the networking infrastructure needs to be changed; I have already adapted the patch series accordingly. See the mail thread "hardware time stamping with optional structs in data area" for details: http://kerneltrap.org/mailarchive/linux-netdev/2009/1/21/4781774 We are less sure about these two patches because they are outside of the network subsystem. The clocksource patch was already reviewed by John and hasn't been changed since then. The second patch hasn't been reviewed. Both patches add code which is not called and has no effect unless a driver developer decides to use this utility code. The larger patch series contains patches to the igb driver which invoke the code. This is how I tested it on 32 and 64 bit x86. How should we proceed with these patches? David and I agree that it would make sense to include them via the net-next-2.6 together with the rest of the patch series. That way we ensure that no dead code without users ends up in the kernel. Please let us know how we can coordinate this so that friction between the subsystem trees is minimized. Diff summary: include/linux/clocksource.h | 101 ++++++++++++++++++++++ include/linux/clocksync.h | 102 ++++++++++++++++++++++ kernel/time/Makefile | 2 kernel/time/clocksource.c | 77 ++++++++++++++++- kernel/time/clocksync.c | 198 +++++++++++++++++++++++++++++++++++++++++++- 5 files changed, 477 insertions(+), 3 deletions(-) -- Best Regards, Patrick Ohly The content of this message is my personal opinion only and although I am an employee of Intel, the statements I make here in no way represent Intel's position on the issue, nor am I authorized to speak on behalf of Intel on this matter. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 13:01 clock synchronization utility code Patrick Ohly @ 2009-02-04 13:01 ` Patrick Ohly 2009-02-04 13:01 ` [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time Patrick Ohly 2009-02-04 14:03 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Daniel Walker 0 siblings, 2 replies; 23+ messages in thread From: Patrick Ohly @ 2009-02-04 13:01 UTC (permalink / raw) To: linux-kernel Cc: netdev, David Miller, John Stultz, Thomas Gleixner, Patrick Ohly So far struct clocksource acted as the interface between time/timekeeping.c and hardware. This patch generalizes the concept so that a similar interface can also be used in other contexts. For that it introduces new structures and related functions *without* touching the existing struct clocksource. The reasons for adding these new structures to clocksource.[ch] are * the APIs are clearly related * struct clocksource could be cleaned up to use the new structs * avoids proliferation of files with similar names (timesource.h? timecounter.h?) As outlined in the discussion with John Stultz, this patch adds * struct cyclecounter: stateless API to hardware which counts clock cycles * struct timecounter: stateful utility code built on a cyclecounter which provides a nanosecond counter * only the function to read the nanosecond counter; deltas are used internally and not exposed to users of timecounter The code does no locking of the shared state. It must be called at least as often as the cycle counter wraps around to detect these wrap arounds. Both is the responsibility of the timecounter user. Acked-by: John Stultz <johnstul@us.ibm.com> --- include/linux/clocksource.h | 101 +++++++++++++++++++++++++++++++++++++++++++ kernel/time/clocksource.c | 76 ++++++++++++++++++++++++++++++++ 2 files changed, 177 insertions(+), 0 deletions(-) diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h index f88d32f..0729ce2 100644 --- a/include/linux/clocksource.h +++ b/include/linux/clocksource.h @@ -22,8 +22,109 @@ typedef u64 cycle_t; struct clocksource; /** + * struct cyclecounter - hardware abstraction for a free running counter + * Provides completely state-free accessors to the underlying hardware. + * Depending on which hardware it reads, the cycle counter may wrap + * around quickly. Locking rules (if necessary) have to be defined + * by the implementor and user of specific instances of this API. + * + * @read: returns the current cycle value + * @mask: bitmask for two's complement + * subtraction of non 64 bit counters, + * see CLOCKSOURCE_MASK() helper macro + * @mult: cycle to nanosecond multiplier + * @shift: cycle to nanosecond divisor (power of two) + */ +struct cyclecounter { + cycle_t (*read)(const struct cyclecounter *cc); + cycle_t mask; + u32 mult; + u32 shift; +}; + +/** + * struct timecounter - layer above a %struct cyclecounter which counts nanoseconds + * Contains the state needed by timecounter_read() to detect + * cycle counter wrap around. Initialize with + * timecounter_init(). Also used to convert cycle counts into the + * corresponding nanosecond counts with timecounter_cyc2time(). Users + * of this code are responsible for initializing the underlying + * cycle counter hardware, locking issues and reading the time + * more often than the cycle counter wraps around. The nanosecond + * counter will only wrap around after ~585 years. + * + * @cc: the cycle counter used by this instance + * @cycle_last: most recent cycle counter value seen by + * timecounter_read() + * @nsec: continuously increasing count + */ +struct timecounter { + const struct cyclecounter *cc; + cycle_t cycle_last; + u64 nsec; +}; + +/** + * cyclecounter_cyc2ns - converts cycle counter cycles to nanoseconds + * @tc: Pointer to cycle counter. + * @cycles: Cycles + * + * XXX - This could use some mult_lxl_ll() asm optimization. Same code + * as in cyc2ns, but with unsigned result. + */ +static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc, + cycle_t cycles) +{ + u64 ret = (u64)cycles; + ret = (ret * cc->mult) >> cc->shift; + return ret; +} + +/** + * timecounter_init - initialize a time counter + * @tc: Pointer to time counter which is to be initialized/reset + * @cc: A cycle counter, ready to be used. + * @start_tstamp: Arbitrary initial time stamp. + * + * After this call the current cycle register (roughly) corresponds to + * the initial time stamp. Every call to timecounter_read() increments + * the time stamp counter by the number of elapsed nanoseconds. + */ +extern void timecounter_init(struct timecounter *tc, + const struct cyclecounter *cc, + u64 start_tstamp); + +/** + * timecounter_read - return nanoseconds elapsed since timecounter_init() + * plus the initial time stamp + * @tc: Pointer to time counter. + * + * In other words, keeps track of time since the same epoch as + * the function which generated the initial time stamp. + */ +extern u64 timecounter_read(struct timecounter *tc); + +/** + * timecounter_cyc2time - convert a cycle counter to same + * time base as values returned by + * timecounter_read() + * @tc: Pointer to time counter. + * @cycle: a value returned by tc->cc->read() + * + * Cycle counts that are converted correctly as long as they + * fall into the interval [-1/2 max cycle count, +1/2 max cycle count], + * with "max cycle count" == cs->mask+1. + * + * This allows conversion of cycle counter values which were generated + * in the past. + */ +extern u64 timecounter_cyc2time(struct timecounter *tc, + cycle_t cycle_tstamp); + +/** * struct clocksource - hardware abstraction for a free running counter * Provides mostly state-free accessors to the underlying hardware. + * This is the structure used for system time. * * @name: ptr to clocksource name * @list: list head for registration diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index ca89e15..6d4a267 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -31,6 +31,82 @@ #include <linux/sched.h> /* for spin_unlock_irq() using preempt_count() m68k */ #include <linux/tick.h> +void timecounter_init(struct timecounter *tc, + const struct cyclecounter *cc, + u64 start_tstamp) +{ + tc->cc = cc; + tc->cycle_last = cc->read(cc); + tc->nsec = start_tstamp; +} +EXPORT_SYMBOL(timecounter_init); + +/** + * clocksource_read_ns - get nanoseconds since last call of this function + * @tc: Pointer to time counter + * + * When the underlying cycle counter runs over, this will be handled + * correctly as long as it does not run over more than once between + * calls. + * + * The first call to this function for a new time counter initializes + * the time tracking and returns bogus results. + */ +static u64 timecounter_read_delta(struct timecounter *tc) +{ + cycle_t cycle_now, cycle_delta; + u64 ns_offset; + + /* read cycle counter: */ + cycle_now = tc->cc->read(tc->cc); + + /* calculate the delta since the last timecounter_read_delta(): */ + cycle_delta = (cycle_now - tc->cycle_last) & tc->cc->mask; + + /* convert to nanoseconds: */ + ns_offset = cyclecounter_cyc2ns(tc->cc, cycle_delta); + + /* update time stamp of timecounter_read_delta() call: */ + tc->cycle_last = cycle_now; + + return ns_offset; +} + +u64 timecounter_read(struct timecounter *tc) +{ + u64 nsec; + + /* increment time by nanoseconds since last call */ + nsec = timecounter_read_delta(tc); + nsec += tc->nsec; + tc->nsec = nsec; + + return nsec; +} +EXPORT_SYMBOL(timecounter_read); + +u64 timecounter_cyc2time(struct timecounter *tc, + cycle_t cycle_tstamp) +{ + u64 cycle_delta = (cycle_tstamp - tc->cycle_last) & tc->cc->mask; + u64 nsec; + + /* + * Instead of always treating cycle_tstamp as more recent + * than tc->cycle_last, detect when it is too far in the + * future and treat it as old time stamp instead. + */ + if (cycle_delta > tc->cc->mask / 2) { + cycle_delta = (tc->cycle_last - cycle_tstamp) & tc->cc->mask; + nsec = tc->nsec - cyclecounter_cyc2ns(tc->cc, cycle_delta); + } else { + nsec = cyclecounter_cyc2ns(tc->cc, cycle_delta) + tc->nsec; + } + + return nsec; +} +EXPORT_SYMBOL(timecounter_cyc2time); + /* XXX - Would like a better way for initializing curr_clocksource */ extern struct clocksource clocksource_jiffies; -- 1.6.0.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time 2009-02-04 13:01 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Patrick Ohly @ 2009-02-04 13:01 ` Patrick Ohly 2009-02-04 19:44 ` john stultz 2009-02-04 14:03 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Daniel Walker 1 sibling, 1 reply; 23+ messages in thread From: Patrick Ohly @ 2009-02-04 13:01 UTC (permalink / raw) To: linux-kernel Cc: netdev, David Miller, John Stultz, Thomas Gleixner, Patrick Ohly Mapping from time counter to system time is implemented. This is sufficient to use this code in a network device driver which wants to support hardware time stamping and transformation of hardware time stamps to system time. The interface could have been made more versatile by not depending on a time counter, but this wasn't done to avoid writing glue code elsewhere. The method implemented here is the one used and analyzed under the name "assisted PTP" in the LCI PTP paper: http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Ohly_92221.pdf --- include/linux/clocksync.h | 102 +++++++++++++++++++++++ kernel/time/Makefile | 2 +- kernel/time/clocksync.c | 197 +++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 300 insertions(+), 1 deletions(-) create mode 100644 include/linux/clocksync.h create mode 100644 kernel/time/clocksync.c diff --git a/include/linux/clocksync.h b/include/linux/clocksync.h new file mode 100644 index 0000000..07c0cc1 --- /dev/null +++ b/include/linux/clocksync.h @@ -0,0 +1,102 @@ +/* + * Utility code which helps transforming between hardware time stamps + * generated by a clocksource and system time. The clocksource is + * assumed to return monotonically increasing time (but this code does + * its best to compensate if that is not the case) whereas system time + * may jump. + * + * Copyright(c) 2009 Intel Corporation. + * Author: Patrick Ohly <patrick.ohly@intel.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + */ +#ifndef _LINUX_CLOCKSYNC_H +#define _LINUX_CLOCKSYNC_H + +#include <linux/clocksource.h> +#include <linux/ktime.h> + +/** + * struct clocksync - stores state and configuration for the two clocks + * + * Initialize to zero, then set clock, systime, num_samples. + * + * Transformation between HW time and system time is done with: + * HW time transformed = HW time + offset + + * (HW time - last_update) * skew / + * CLOCKSYNC_SKEW_RESOLUTION + * + * @clock: the source for HW time stamps (%clocksource_read_time) + * @systime: function returning current system time (ktime_get + * for monotonic time, or ktime_get_real for wall clock) + * @num_samples: number of times that HW time and system time are to + * be compared when determining their offset + * @offset: (system time - HW time) at the time of the last update + * @skew: average (system time - HW time) / delta HW time * + * CLOCKSYNC_SKEW_RESOLUTION + * @last_update: last HW time stamp when clock offset was measured + */ +struct clocksync { + struct timecounter *clock; + ktime_t (*systime)(void); + int num_samples; + + s64 offset; + s64 skew; + u64 last_update; +}; + +/** + * clocksync_hw2sys - transform HW time stamp into corresponding system time + * @sync: context for clock sync + * @hwtstamp: the result of timecounter_read() or + * timecounter_cyc2time() + */ +extern ktime_t clocksync_hw2sys(struct clocksync *sync, + u64 hwtstamp); + +/** + * clocksync_offset - measure current (system time - HW time) offset + * @sync: context for clock sync + * @offset: average offset during sample period returned here + * @hwtstamp: average HW time during sample period returned here + * + * Returns number of samples used. Might be zero (= no result) in the + * unlikely case that system time was monotonically decreasing for all + * samples (= broken). + */ +extern int clocksync_offset(struct clocksync *sync, + s64 *offset, + u64 *hwtstamp); + +extern void __clocksync_update(struct clocksync *sync, + u64 hwtstamp); + +/** + * clocksync_update - update offset and skew by measuring current offset + * @sync: context for clock sync + * @hwtstamp: the result of timecounter_read() or + * timecounter_cyc2time(), pass zero to force update + * + * Updates are only done at most once per second. + */ +static inline void clocksync_update(struct clocksync *sync, + u64 hwtstamp) +{ + if (!hwtstamp || + (s64)(hwtstamp - sync->last_update) >= NSEC_PER_SEC) + __clocksync_update(sync, hwtstamp); +} + +#endif /* _LINUX_CLOCKSYNC_H */ diff --git a/kernel/time/Makefile b/kernel/time/Makefile index 905b0b5..6279fb0 100644 --- a/kernel/time/Makefile +++ b/kernel/time/Makefile @@ -1,4 +1,4 @@ -obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o +obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o clocksync.o obj-$(CONFIG_GENERIC_CLOCKEVENTS_BUILD) += clockevents.o obj-$(CONFIG_GENERIC_CLOCKEVENTS) += tick-common.o diff --git a/kernel/time/clocksync.c b/kernel/time/clocksync.c new file mode 100644 index 0000000..c10857d --- /dev/null +++ b/kernel/time/clocksync.c @@ -0,0 +1,197 @@ +/* + * Utility code which helps transforming between hardware time stamps + * generated by a timecounter and system time. + * + * Copyright (C) 20098 Intel. + * Author: Patrick Ohly <patrick.ohly@intel.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#include <linux/clocksync.h> +#include <linux/module.h> +#include <linux/math64.h> + +/* + * fixed point arithmetic scale factor for skew + * + * Usually one would measure skew in ppb (parts per billion, 1e9), but + * using a factor of 2 simplifies the math. + */ +#define CLOCKSYNC_SKEW_RESOLUTION (((s64)1)<<30) + +ktime_t clocksync_hw2sys(struct clocksync *sync, + u64 hwtstamp) +{ + u64 nsec; + + nsec = hwtstamp + sync->offset; + nsec += (s64)(hwtstamp - sync->last_update) * sync->skew / + CLOCKSYNC_SKEW_RESOLUTION; + + return ns_to_ktime(nsec); +} +EXPORT_SYMBOL(clocksync_hw2sys); + +int clocksync_offset(struct clocksync *sync, + s64 *offset, + u64 *hwtstamp) +{ + u64 starthw = 0, endhw = 0; + struct { + s64 offset; + s64 duration_sys; + } buffer[10], sample, *samples; + int counter = 0, i; + int used; + int index; + int num_samples = sync->num_samples; + + if (num_samples > sizeof(buffer)/sizeof(buffer[0])) { + samples = kmalloc(sizeof(*samples) * num_samples, GFP_ATOMIC); + if (!samples) { + samples = buffer; + num_samples = sizeof(buffer)/sizeof(buffer[0]); + } + } else { + samples = buffer; + } + + /* run until we have enough valid samples, but do not try forever */ + i = 0; + counter = 0; + while (1) { + u64 ts; + ktime_t start, end; + + start = sync->systime(); + ts = timecounter_read(sync->clock); + end = sync->systime(); + + if (!i) + starthw = ts; + + /* ignore negative durations */ + sample.duration_sys = ktime_to_ns(ktime_sub(end, start)); + if (sample.duration_sys >= 0) { + /* + * assume symetric delay to and from HW: + * average system time corresponds to measured + * HW time + */ + sample.offset = + ktime_to_ns(ktime_add(end, start)) / 2 - + ts; + + /* simple insertion sort based on duration */ + index = counter - 1; + while (index >= 0) { + if (samples[index].duration_sys < + sample.duration_sys) + break; + samples[index + 1] = samples[index]; + index--; + } + samples[index + 1] = sample; + counter++; + } + + i++; + if (counter >= num_samples || i >= 100000) { + endhw = ts; + break; + } + } + + *hwtstamp = (endhw + starthw) / 2; + + /* remove outliers by only using 75% of the samples */ + used = counter * 3 / 4; + if (!used) + used = counter; + if (used) { + /* calculate average */ + s64 off = 0; + for (index = 0; index < used; index++) + off += samples[index].offset; + *offset = div_s64(off, used); + } + + if (samples && samples != buffer) + kfree(samples); + + return used; +} +EXPORT_SYMBOL(clocksync_offset); + +void __clocksync_update(struct clocksync *sync, + u64 hwtstamp) +{ + s64 offset; + u64 average_time; + + if (!clocksync_offset(sync, &offset, &average_time)) + return; + + printk(KERN_DEBUG + "average offset: %lld\n", offset); + + if (!sync->last_update) { + sync->last_update = average_time; + sync->offset = offset; + sync->skew = 0; + } else { + s64 delta_nsec = average_time - sync->last_update; + + /* avoid division by negative or small deltas */ + if (delta_nsec >= 10000) { + s64 delta_offset_nsec = offset - sync->offset; + s64 skew; /* delta_offset_nsec * + CLOCKSYNC_SKEW_RESOLUTION / + delta_nsec */ + u64 divisor; + + /* div_s64() is limited to 32 bit divisor */ + skew = delta_offset_nsec * CLOCKSYNC_SKEW_RESOLUTION; + divisor = delta_nsec; + while (unlikely(divisor >= ((s64)1) << 32)) { + /* divide both by 2; beware, right shift + of negative value has undefined + behavior and can only be used for + the positive divisor */ + skew = div_s64(skew, 2); + divisor >>= 1; + } + skew = div_s64(skew, divisor); + + /* + * Calculate new overall skew as 4/16 the + * old value and 12/16 the new one. This is + * a rather arbitrary tradeoff between + * only using the latest measurement (0/16 and + * 16/16) and even more weight on past measurements. + */ +#define CLOCKSYNC_NEW_SKEW_PER_16 12 + sync->skew = + div_s64((16 - CLOCKSYNC_NEW_SKEW_PER_16) * + sync->skew + + CLOCKSYNC_NEW_SKEW_PER_16 * skew, + 16); + sync->last_update = average_time; + sync->offset = offset; + } + } +} +EXPORT_SYMBOL(__clocksync_update); -- 1.6.0.4 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time 2009-02-04 13:01 ` [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time Patrick Ohly @ 2009-02-04 19:44 ` john stultz 2009-02-05 10:21 ` Patrick Ohly 0 siblings, 1 reply; 23+ messages in thread From: john stultz @ 2009-02-04 19:44 UTC (permalink / raw) To: Patrick Ohly; +Cc: linux-kernel, netdev, David Miller, Thomas Gleixner On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > Mapping from time counter to system time is implemented. This is sufficient to use > this code in a network device driver which wants to support hardware time stamping > and transformation of hardware time stamps to system time. > > The interface could have been made more versatile by not depending on a time counter, > but this wasn't done to avoid writing glue code elsewhere. > > The method implemented here is the one used and analyzed under the name > "assisted PTP" in the LCI PTP paper: > http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Ohly_92221.pdf > --- > include/linux/clocksync.h | 102 +++++++++++++++++++++++ > kernel/time/Makefile | 2 +- > kernel/time/clocksync.c | 197 +++++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 300 insertions(+), 1 deletions(-) > create mode 100644 include/linux/clocksync.h > create mode 100644 kernel/time/clocksync.c I think my main critique of this somewhat trivial, but still important, as confusion is common in this area. I sort of object to the name clocksync, as you're not really doing anything to sync clocks in the code. One, "clock" is an way overloaded term in the kernel. Two, you're really seem to be just providing deltas and skew rates between notions of time. I want to avoid someone thinking "Oh, NTP must use this code". So maybe something like timecompare.c? If this code is really PTP purposed, maybe ptp should be in the name? > diff --git a/include/linux/clocksync.h b/include/linux/clocksync.h > new file mode 100644 > index 0000000..07c0cc1 > --- /dev/null > +++ b/include/linux/clocksync.h > @@ -0,0 +1,102 @@ > +/* > + * Utility code which helps transforming between hardware time stamps > + * generated by a clocksource and system time. The clocksource is > + * assumed to return monotonically increasing time (but this code does > + * its best to compensate if that is not the case) whereas system time > + * may jump. You're not using clocksources here anymore, right? Probably needs an update. > + * > + * Copyright(c) 2009 Intel Corporation. > + * Author: Patrick Ohly <patrick.ohly@intel.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program; if not, write to the Free Software Foundation, Inc., > + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. > + */ > +#ifndef _LINUX_CLOCKSYNC_H > +#define _LINUX_CLOCKSYNC_H > + > +#include <linux/clocksource.h> > +#include <linux/ktime.h> > + > +/** > + * struct clocksync - stores state and configuration for the two clocks > + * > + * Initialize to zero, then set clock, systime, num_samples. > + * > + * Transformation between HW time and system time is done with: So hw time is overloaded as well. It tends to be thought of as the CMOS/RTC clock. Would PTP or NIC time be ok? (It avoids network-time which also has ntp connotations) Or are there other uses for this code other then the PTP code? > + * HW time transformed = HW time + offset + > + * (HW time - last_update) * skew / > + * CLOCKSYNC_SKEW_RESOLUTION > + * > + * @clock: the source for HW time stamps (%clocksource_read_time) nix clocksource. > + * @systime: function returning current system time (ktime_get > + * for monotonic time, or ktime_get_real for wall clock) So, are non-CLOCK_REALTIME clockids actually used? > + * @num_samples: number of times that HW time and system time are to > + * be compared when determining their offset > + * @offset: (system time - HW time) at the time of the last update > + * @skew: average (system time - HW time) / delta HW time * > + * CLOCKSYNC_SKEW_RESOLUTION > + * @last_update: last HW time stamp when clock offset was measured > + */ > +struct clocksync { struct time_comparator { ? > + struct timecounter *clock; > + ktime_t (*systime)(void); > + int num_samples; > + > + s64 offset; > + s64 skew; > + u64 last_update; > +}; > + > +/** > + * clocksync_hw2sys - transform HW time stamp into corresponding system time > + * @sync: context for clock sync > + * @hwtstamp: the result of timecounter_read() or > + * timecounter_cyc2time() > + */ > +extern ktime_t clocksync_hw2sys(struct clocksync *sync, > + u64 hwtstamp); Ugh. hw2sys again is overloaded for reading the cmos/RTC persistent clock and setting the system time. So overall I don't have any objections with the code itself. Its fairly isolated and doesn't interact with the timekeeping code itself. Sorry for taking so long to get feedback to you, I had started looking at this right before the holiday and lost context after the break. thanks -john ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time 2009-02-04 19:44 ` john stultz @ 2009-02-05 10:21 ` Patrick Ohly 2009-02-09 17:02 ` Patrick Ohly 0 siblings, 1 reply; 23+ messages in thread From: Patrick Ohly @ 2009-02-05 10:21 UTC (permalink / raw) To: john stultz Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Wed, 2009-02-04 at 21:44 +0200, john stultz wrote: > On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > > Mapping from time counter to system time is implemented. This is sufficient to use > > this code in a network device driver which wants to support hardware time stamping > > and transformation of hardware time stamps to system time. > > > > The interface could have been made more versatile by not depending on a time counter, > > but this wasn't done to avoid writing glue code elsewhere. > > > > The method implemented here is the one used and analyzed under the name > > "assisted PTP" in the LCI PTP paper: > > http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Ohly_92221.pdf > > --- > > include/linux/clocksync.h | 102 +++++++++++++++++++++++ > > kernel/time/Makefile | 2 +- > > kernel/time/clocksync.c | 197 +++++++++++++++++++++++++++++++++++++++++++++ > > 3 files changed, 300 insertions(+), 1 deletions(-) > > create mode 100644 include/linux/clocksync.h > > create mode 100644 kernel/time/clocksync.c > > I think my main critique of this somewhat trivial, but still important, > as confusion is common in this area. Agreed, good names are important - choosing them is often harder than getting the code to work ;-) > I sort of object to the name clocksync, as you're not really doing > anything to sync clocks in the code. One, "clock" is an way overloaded > term in the kernel. Two, you're really seem to be just providing deltas > and skew rates between notions of time. I want to avoid someone thinking > "Oh, NTP must use this code". > > So maybe something like timecompare.c? Fine with me. > If this code is really PTP purposed, maybe ptp should be in the name? It's not PTP specific. I'm not sure whether there are other uses for it, but if something comes up, then having PTP in the name would be wrong. So I prefer timecompare. > > diff --git a/include/linux/clocksync.h b/include/linux/clocksync.h > > new file mode 100644 > > index 0000000..07c0cc1 > > --- /dev/null > > +++ b/include/linux/clocksync.h > > @@ -0,0 +1,102 @@ > > +/* > > + * Utility code which helps transforming between hardware time stamps > > + * generated by a clocksource and system time. The clocksource is > > + * assumed to return monotonically increasing time (but this code does > > + * its best to compensate if that is not the case) whereas system time > > + * may jump. > > You're not using clocksources here anymore, right? Probably needs an > update. Right. > > +/** > > + * struct clocksync - stores state and configuration for the two clocks > > + * > > + * Initialize to zero, then set clock, systime, num_samples. > > + * > > + * Transformation between HW time and system time is done with: > > So hw time is overloaded as well. It tends to be thought of as the > CMOS/RTC clock. Would PTP or NIC time be ok? (It avoids network-time > which also has ntp connotations) Or are there other uses for this code > other then the PTP code? As said above, there might be. I should better avoid all references to HW and system and just speak of "source" and "target" time, with just one motivating example given that refers to NIC and system time. > > + * @systime: function returning current system time (ktime_get > > + * for monotonic time, or ktime_get_real for wall clock) > > So, are non-CLOCK_REALTIME clockids actually used? Not at the moment, but I can imagine that this might be useful at some point. > > + * @num_samples: number of times that HW time and system time are to > > + * be compared when determining their offset > > + * @offset: (system time - HW time) at the time of the last update > > + * @skew: average (system time - HW time) / delta HW time * > > + * CLOCKSYNC_SKEW_RESOLUTION > > + * @last_update: last HW time stamp when clock offset was measured > > + */ > > +struct clocksync { > > struct time_comparator { ? Why not simply "timecompare"? It's the central data structure in this module, similar to "clocksource" in "clocksource.[ch]". Apart from that I don't mind using time_comparator. > > + struct timecounter *clock; > > + ktime_t (*systime)(void); > > + int num_samples; > > + > > + s64 offset; > > + s64 skew; > > + u64 last_update; > > +}; > > + > > +/** > > + * clocksync_hw2sys - transform HW time stamp into corresponding system time > > + * @sync: context for clock sync > > + * @hwtstamp: the result of timecounter_read() or > > + * timecounter_cyc2time() > > + */ > > +extern ktime_t clocksync_hw2sys(struct clocksync *sync, > > + u64 hwtstamp); > > Ugh. hw2sys again is overloaded for reading the cmos/RTC persistent > clock and setting the system time. timecompare_transform()? -- Best Regards, Patrick Ohly The content of this message is my personal opinion only and although I am an employee of Intel, the statements I make here in no way represent Intel's position on the issue, nor am I authorized to speak on behalf of Intel on this matter. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time 2009-02-05 10:21 ` Patrick Ohly @ 2009-02-09 17:02 ` Patrick Ohly 2009-02-09 19:27 ` John Stultz 0 siblings, 1 reply; 23+ messages in thread From: Patrick Ohly @ 2009-02-09 17:02 UTC (permalink / raw) To: john stultz Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Thu, 2009-02-05 at 10:21 +0000, Patrick Ohly wrote: > On Wed, 2009-02-04 at 21:44 +0200, john stultz wrote: > > On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > > I sort of object to the name clocksync, as you're not really doing > > anything to sync clocks in the code. One, "clock" is an way overloaded > > term in the kernel. Two, you're really seem to be just providing deltas > > and skew rates between notions of time. I want to avoid someone thinking > > "Oh, NTP must use this code". > > > > So maybe something like timecompare.c? > > Fine with me. As there were no other comments I renamed the file, functions and struct accordingly. As I said in my mail, I prefer "struct timecompare" over "struct time_comparator". I also used "timecompare_transform()". Is this revision of the patch okay? How should the two patches get included in the main kernel - via netdev-next-2.6? Bye, Patrick ------------------------------------------------- Subject: [PATCH NET-NEXT 02/10] timecompare: generic infrastructure to map between two time bases Mapping from a struct timecounter to a time returned by functions like ktime_get_real() is implemented. This is sufficient to use this code in a network device driver which wants to support hardware time stamping and transformation of hardware time stamps to system time. The interface could have been made more versatile by not depending on a time counter, but this wasn't done to avoid writing glue code elsewhere. The method implemented here is the one used and analyzed under the name "assisted PTP" in the LCI PTP paper: http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Ohly_92221.pdf --- include/linux/timecompare.h | 125 +++++++++++++++++++++++++++ kernel/time/Makefile | 2 +- kernel/time/timecompare.c | 194 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 320 insertions(+), 1 deletions(-) create mode 100644 include/linux/timecompare.h create mode 100644 kernel/time/timecompare.c diff --git a/include/linux/timecompare.h b/include/linux/timecompare.h new file mode 100644 index 0000000..f88c454 --- /dev/null +++ b/include/linux/timecompare.h @@ -0,0 +1,125 @@ +/* + * Utility code which helps transforming between two different time + * bases, called "source" and "target" time in this code. + * + * Source time has to be provided via the timecounter API while target + * time is accessed via a function callback whose prototype + * intentionally matches ktime_get() and ktime_get_real(). These + * interfaces where chosen like this so that the code serves its + * initial purpose without additional glue code. + * + * This purpose is synchronizing a hardware clock in a NIC with system + * time, in order to implement the Precision Time Protocol (PTP, + * IEEE1588) with more accurate hardware assisted time stamping. In + * that context only synchronization against system time (= + * ktime_get_real()) is currently needed. But this utility code might + * become useful in other situations, which is why it was written as + * general purpose utility code. + * + * The source timecounter is assumed to return monotonically + * increasing time (but this code does its best to compensate if that + * is not the case) whereas target time may jump. + * + * The target time corresponding to a source time is determined by + * reading target time, reading source time, reading target time + * again, then assuming that average target time corresponds to source + * time. In other words, the assumption is that reading the source + * time is slow and involves equal time for sending the request and + * receiving the reply, whereas reading target time is assumed to be + * fast. + * + * Copyright(c) 2009 Intel Corporation. + * Author: Patrick Ohly <patrick.ohly@intel.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, write to the Free Software Foundation, Inc., + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. + */ +#ifndef _LINUX_TIMECOMPARE_H +#define _LINUX_TIMECOMPARE_H + +#include <linux/clocksource.h> +#include <linux/ktime.h> + +/** + * struct timecompare - stores state and configuration for the two clocks + * + * Initialize to zero, then set source/target/num_samples. + * + * Transformation between source time and target time is done with: + * target_time = source_time + offset + + * (source_time - last_update) * skew / + * TIMECOMPARE_SKEW_RESOLUTION + * + * @source: used to get source time stamps via timecounter_read() + * @target: function returning target time (for example, ktime_get + * for monotonic time, or ktime_get_real for wall clock) + * @num_samples: number of times that source time and target time are to + * be compared when determining their offset + * @offset: (target time - source time) at the time of the last update + * @skew: average (target time - source time) / delta source time * + * TIMECOMPARE_SKEW_RESOLUTION + * @last_update: last source time stamp when time offset was measured + */ +struct timecompare { + struct timecounter *source; + ktime_t (*target)(void); + int num_samples; + + s64 offset; + s64 skew; + u64 last_update; +}; + +/** + * timecompare_transform - transform source time stamp into target time base + * @sync: context for time sync + * @source_tstamp: the result of timecounter_read() or + * timecounter_cyc2time() + */ +extern ktime_t timecompare_transform(struct timecompare *sync, + u64 source_tstamp); + +/** + * timecompare_offset - measure current (target time - source time) offset + * @sync: context for time sync + * @offset: average offset during sample period returned here + * @source_tstamp: average source time during sample period returned here + * + * Returns number of samples used. Might be zero (= no result) in the + * unlikely case that target time was monotonically decreasing for all + * samples (= broken). + */ +extern int timecompare_offset(struct timecompare *sync, + s64 *offset, + u64 *source_tstamp); + +extern void __timecompare_update(struct timecompare *sync, + u64 source_tstamp); + +/** + * timecompare_update - update offset and skew by measuring current offset + * @sync: context for time sync + * @source_tstamp: the result of timecounter_read() or + * timecounter_cyc2time(), pass zero to force update + * + * Updates are only done at most once per second. + */ +static inline void timecompare_update(struct timecompare *sync, + u64 source_tstamp) +{ + if (!source_tstamp || + (s64)(source_tstamp - sync->last_update) >= NSEC_PER_SEC) + __timecompare_update(sync, source_tstamp); +} + +#endif /* _LINUX_TIMECOMPARE_H */ diff --git a/kernel/time/Makefile b/kernel/time/Makefile index 905b0b5..0b0a636 100644 --- a/kernel/time/Makefile +++ b/kernel/time/Makefile @@ -1,4 +1,4 @@ -obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o +obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o timecompare.o obj-$(CONFIG_GENERIC_CLOCKEVENTS_BUILD) += clockevents.o obj-$(CONFIG_GENERIC_CLOCKEVENTS) += tick-common.o diff --git a/kernel/time/timecompare.c b/kernel/time/timecompare.c new file mode 100644 index 0000000..1e94abc --- /dev/null +++ b/kernel/time/timecompare.c @@ -0,0 +1,194 @@ +/* + * Copyright (C) 2009 Intel Corporation. + * Author: Patrick Ohly <patrick.ohly@intel.com> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. + */ + +#include <linux/timecompare.h> +#include <linux/module.h> +#include <linux/math64.h> + +/* + * fixed point arithmetic scale factor for skew + * + * Usually one would measure skew in ppb (parts per billion, 1e9), but + * using a factor of 2 simplifies the math. + */ +#define TIMECOMPARE_SKEW_RESOLUTION (((s64)1)<<30) + +ktime_t timecompare_transform(struct timecompare *sync, + u64 source_tstamp) +{ + u64 nsec; + + nsec = source_tstamp + sync->offset; + nsec += (s64)(source_tstamp - sync->last_update) * sync->skew / + TIMECOMPARE_SKEW_RESOLUTION; + + return ns_to_ktime(nsec); +} +EXPORT_SYMBOL(timecompare_transform); + +int timecompare_offset(struct timecompare *sync, + s64 *offset, + u64 *source_tstamp) +{ + u64 start_source = 0, end_source = 0; + struct { + s64 offset; + s64 duration_target; + } buffer[10], sample, *samples; + int counter = 0, i; + int used; + int index; + int num_samples = sync->num_samples; + + if (num_samples > sizeof(buffer)/sizeof(buffer[0])) { + samples = kmalloc(sizeof(*samples) * num_samples, GFP_ATOMIC); + if (!samples) { + samples = buffer; + num_samples = sizeof(buffer)/sizeof(buffer[0]); + } + } else { + samples = buffer; + } + + /* run until we have enough valid samples, but do not try forever */ + i = 0; + counter = 0; + while (1) { + u64 ts; + ktime_t start, end; + + start = sync->target(); + ts = timecounter_read(sync->source); + end = sync->target(); + + if (!i) + start_source = ts; + + /* ignore negative durations */ + sample.duration_target = ktime_to_ns(ktime_sub(end, start)); + if (sample.duration_target >= 0) { + /* + * assume symetric delay to and from source: + * average target time corresponds to measured + * source time + */ + sample.offset = + ktime_to_ns(ktime_add(end, start)) / 2 - + ts; + + /* simple insertion sort based on duration */ + index = counter - 1; + while (index >= 0) { + if (samples[index].duration_target < + sample.duration_target) + break; + samples[index + 1] = samples[index]; + index--; + } + samples[index + 1] = sample; + counter++; + } + + i++; + if (counter >= num_samples || i >= 100000) { + end_source = ts; + break; + } + } + + *source_tstamp = (end_source + start_source) / 2; + + /* remove outliers by only using 75% of the samples */ + used = counter * 3 / 4; + if (!used) + used = counter; + if (used) { + /* calculate average */ + s64 off = 0; + for (index = 0; index < used; index++) + off += samples[index].offset; + *offset = div_s64(off, used); + } + + if (samples && samples != buffer) + kfree(samples); + + return used; +} +EXPORT_SYMBOL(timecompare_offset); + +void __timecompare_update(struct timecompare *sync, + u64 source_tstamp) +{ + s64 offset; + u64 average_time; + + if (!timecompare_offset(sync, &offset, &average_time)) + return; + + printk(KERN_DEBUG + "average offset: %lld\n", offset); + + if (!sync->last_update) { + sync->last_update = average_time; + sync->offset = offset; + sync->skew = 0; + } else { + s64 delta_nsec = average_time - sync->last_update; + + /* avoid division by negative or small deltas */ + if (delta_nsec >= 10000) { + s64 delta_offset_nsec = offset - sync->offset; + s64 skew; /* delta_offset_nsec * + TIMECOMPARE_SKEW_RESOLUTION / + delta_nsec */ + u64 divisor; + + /* div_s64() is limited to 32 bit divisor */ + skew = delta_offset_nsec * TIMECOMPARE_SKEW_RESOLUTION; + divisor = delta_nsec; + while (unlikely(divisor >= ((s64)1) << 32)) { + /* divide both by 2; beware, right shift + of negative value has undefined + behavior and can only be used for + the positive divisor */ + skew = div_s64(skew, 2); + divisor >>= 1; + } + skew = div_s64(skew, divisor); + + /* + * Calculate new overall skew as 4/16 the + * old value and 12/16 the new one. This is + * a rather arbitrary tradeoff between + * only using the latest measurement (0/16 and + * 16/16) and even more weight on past measurements. + */ +#define TIMECOMPARE_NEW_SKEW_PER_16 12 + sync->skew = + div_s64((16 - TIMECOMPARE_NEW_SKEW_PER_16) * + sync->skew + + TIMECOMPARE_NEW_SKEW_PER_16 * skew, + 16); + sync->last_update = average_time; + sync->offset = offset; + } + } +} +EXPORT_SYMBOL(__timecompare_update); -- 1.6.1.2 -- Best Regards, Patrick Ohly The content of this message is my personal opinion only and although I am an employee of Intel, the statements I make here in no way represent Intel's position on the issue, nor am I authorized to speak on behalf of Intel on this matter. ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time 2009-02-09 17:02 ` Patrick Ohly @ 2009-02-09 19:27 ` John Stultz 2009-02-09 21:46 ` Patrick Ohly 2009-02-09 22:57 ` David Miller 0 siblings, 2 replies; 23+ messages in thread From: John Stultz @ 2009-02-09 19:27 UTC (permalink / raw) To: Patrick Ohly Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Mon, 2009-02-09 at 18:02 +0100, Patrick Ohly wrote: > On Thu, 2009-02-05 at 10:21 +0000, Patrick Ohly wrote: > > On Wed, 2009-02-04 at 21:44 +0200, john stultz wrote: > > > On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > > > I sort of object to the name clocksync, as you're not really doing > > > anything to sync clocks in the code. One, "clock" is an way overloaded > > > term in the kernel. Two, you're really seem to be just providing deltas > > > and skew rates between notions of time. I want to avoid someone thinking > > > "Oh, NTP must use this code". > > > > > > So maybe something like timecompare.c? > > > > Fine with me. > > As there were no other comments I renamed the file, functions and struct > accordingly. As I said in my mail, I prefer "struct timecompare" over > "struct time_comparator". I also used "timecompare_transform()". > > Is this revision of the patch okay? How should the two patches get > included in the main kernel - via netdev-next-2.6? > > Bye, Patrick Small comment below, but otherwise it looks ok to me. I usually push patches through Andrew, so I'd probably go that way. But I'd leave it to Dave if he's comfortable pushing them to Linus. Acked-by: John Stultz <johnstul@us.ibm.com> > ------------------------------------------------- > > Subject: [PATCH NET-NEXT 02/10] timecompare: generic infrastructure to map between two time bases > > Mapping from a struct timecounter to a time returned by functions like > ktime_get_real() is implemented. This is sufficient to use this code > in a network device driver which wants to support hardware time > stamping and transformation of hardware time stamps to system time. > > The interface could have been made more versatile by not depending on > a time counter, but this wasn't done to avoid writing glue code > elsewhere. > > The method implemented here is the one used and analyzed under the name > "assisted PTP" in the LCI PTP paper: > http://www.linuxclustersinstitute.org/conferences/archive/2008/PDF/Ohly_92221.pdf > --- > include/linux/timecompare.h | 125 +++++++++++++++++++++++++++ > kernel/time/Makefile | 2 +- > kernel/time/timecompare.c | 194 +++++++++++++++++++++++++++++++++++++++++++ > 3 files changed, 320 insertions(+), 1 deletions(-) > create mode 100644 include/linux/timecompare.h > create mode 100644 kernel/time/timecompare.c > > diff --git a/include/linux/timecompare.h b/include/linux/timecompare.h > new file mode 100644 > index 0000000..f88c454 > --- /dev/null > +++ b/include/linux/timecompare.h > @@ -0,0 +1,125 @@ > +/* > + * Utility code which helps transforming between two different time > + * bases, called "source" and "target" time in this code. > + * > + * Source time has to be provided via the timecounter API while target > + * time is accessed via a function callback whose prototype > + * intentionally matches ktime_get() and ktime_get_real(). These > + * interfaces where chosen like this so that the code serves its > + * initial purpose without additional glue code. > + * > + * This purpose is synchronizing a hardware clock in a NIC with system > + * time, in order to implement the Precision Time Protocol (PTP, > + * IEEE1588) with more accurate hardware assisted time stamping. In > + * that context only synchronization against system time (= > + * ktime_get_real()) is currently needed. But this utility code might > + * become useful in other situations, which is why it was written as > + * general purpose utility code. > + * > + * The source timecounter is assumed to return monotonically > + * increasing time (but this code does its best to compensate if that > + * is not the case) whereas target time may jump. > + * > + * The target time corresponding to a source time is determined by > + * reading target time, reading source time, reading target time > + * again, then assuming that average target time corresponds to source > + * time. In other words, the assumption is that reading the source > + * time is slow and involves equal time for sending the request and > + * receiving the reply, whereas reading target time is assumed to be > + * fast. > + * > + * Copyright(c) 2009 Intel Corporation. > + * Author: Patrick Ohly <patrick.ohly@intel.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms and conditions of the GNU General Public License, > + * version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope it will be useful, but WITHOUT > + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > + * FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License for > + * more details. > + * > + * You should have received a copy of the GNU General Public License along with > + * this program; if not, write to the Free Software Foundation, Inc., > + * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA. > + */ > +#ifndef _LINUX_TIMECOMPARE_H > +#define _LINUX_TIMECOMPARE_H > + > +#include <linux/clocksource.h> > +#include <linux/ktime.h> > + > +/** > + * struct timecompare - stores state and configuration for the two clocks > + * > + * Initialize to zero, then set source/target/num_samples. > + * > + * Transformation between source time and target time is done with: > + * target_time = source_time + offset + > + * (source_time - last_update) * skew / > + * TIMECOMPARE_SKEW_RESOLUTION > + * > + * @source: used to get source time stamps via timecounter_read() > + * @target: function returning target time (for example, ktime_get > + * for monotonic time, or ktime_get_real for wall clock) > + * @num_samples: number of times that source time and target time are to > + * be compared when determining their offset > + * @offset: (target time - source time) at the time of the last update > + * @skew: average (target time - source time) / delta source time * > + * TIMECOMPARE_SKEW_RESOLUTION > + * @last_update: last source time stamp when time offset was measured > + */ > +struct timecompare { > + struct timecounter *source; > + ktime_t (*target)(void); > + int num_samples; > + > + s64 offset; > + s64 skew; > + u64 last_update; > +}; > + > +/** > + * timecompare_transform - transform source time stamp into target time base > + * @sync: context for time sync > + * @source_tstamp: the result of timecounter_read() or > + * timecounter_cyc2time() > + */ > +extern ktime_t timecompare_transform(struct timecompare *sync, > + u64 source_tstamp); > + > +/** > + * timecompare_offset - measure current (target time - source time) offset > + * @sync: context for time sync > + * @offset: average offset during sample period returned here > + * @source_tstamp: average source time during sample period returned here > + * > + * Returns number of samples used. Might be zero (= no result) in the > + * unlikely case that target time was monotonically decreasing for all > + * samples (= broken). > + */ > +extern int timecompare_offset(struct timecompare *sync, > + s64 *offset, > + u64 *source_tstamp); > + > +extern void __timecompare_update(struct timecompare *sync, > + u64 source_tstamp); > + > +/** > + * timecompare_update - update offset and skew by measuring current offset > + * @sync: context for time sync > + * @source_tstamp: the result of timecounter_read() or > + * timecounter_cyc2time(), pass zero to force update > + * > + * Updates are only done at most once per second. > + */ > +static inline void timecompare_update(struct timecompare *sync, > + u64 source_tstamp) > +{ > + if (!source_tstamp || > + (s64)(source_tstamp - sync->last_update) >= NSEC_PER_SEC) > + __timecompare_update(sync, source_tstamp); > +} > + > +#endif /* _LINUX_TIMECOMPARE_H */ > diff --git a/kernel/time/Makefile b/kernel/time/Makefile > index 905b0b5..0b0a636 100644 > --- a/kernel/time/Makefile > +++ b/kernel/time/Makefile > @@ -1,4 +1,4 @@ > -obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o > +obj-y += timekeeping.o ntp.o clocksource.o jiffies.o timer_list.o timecompare.o > > obj-$(CONFIG_GENERIC_CLOCKEVENTS_BUILD) += clockevents.o > obj-$(CONFIG_GENERIC_CLOCKEVENTS) += tick-common.o > diff --git a/kernel/time/timecompare.c b/kernel/time/timecompare.c > new file mode 100644 > index 0000000..1e94abc > --- /dev/null > +++ b/kernel/time/timecompare.c > @@ -0,0 +1,194 @@ > +/* > + * Copyright (C) 2009 Intel Corporation. > + * Author: Patrick Ohly <patrick.ohly@intel.com> > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License as published by > + * the Free Software Foundation; either version 2 of the License, or > + * (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License > + * along with this program; if not, write to the Free Software > + * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. > + */ > + > +#include <linux/timecompare.h> > +#include <linux/module.h> > +#include <linux/math64.h> > + > +/* > + * fixed point arithmetic scale factor for skew > + * > + * Usually one would measure skew in ppb (parts per billion, 1e9), but > + * using a factor of 2 simplifies the math. > + */ > +#define TIMECOMPARE_SKEW_RESOLUTION (((s64)1)<<30) > + > +ktime_t timecompare_transform(struct timecompare *sync, > + u64 source_tstamp) > +{ > + u64 nsec; > + > + nsec = source_tstamp + sync->offset; > + nsec += (s64)(source_tstamp - sync->last_update) * sync->skew / > + TIMECOMPARE_SKEW_RESOLUTION; > + > + return ns_to_ktime(nsec); > +} > +EXPORT_SYMBOL(timecompare_transform); > + > +int timecompare_offset(struct timecompare *sync, > + s64 *offset, > + u64 *source_tstamp) > +{ > + u64 start_source = 0, end_source = 0; > + struct { > + s64 offset; > + s64 duration_target; > + } buffer[10], sample, *samples; > + int counter = 0, i; > + int used; > + int index; > + int num_samples = sync->num_samples; > + > + if (num_samples > sizeof(buffer)/sizeof(buffer[0])) { > + samples = kmalloc(sizeof(*samples) * num_samples, GFP_ATOMIC); > + if (!samples) { > + samples = buffer; > + num_samples = sizeof(buffer)/sizeof(buffer[0]); > + } > + } else { > + samples = buffer; > + } > + > + /* run until we have enough valid samples, but do not try forever */ > + i = 0; > + counter = 0; > + while (1) { > + u64 ts; > + ktime_t start, end; > + > + start = sync->target(); > + ts = timecounter_read(sync->source); > + end = sync->target(); > + > + if (!i) > + start_source = ts; > + > + /* ignore negative durations */ > + sample.duration_target = ktime_to_ns(ktime_sub(end, start)); > + if (sample.duration_target >= 0) { You may also want to checking the bounds on the duration_target. If preemption hits and the values are too out of whack, the symetric delay assumption below might be quite invalid. I guess the outliers removal probably covers this as well, but seems some sanity checking might be good. > + /* > + * assume symetric delay to and from source: > + * average target time corresponds to measured > + * source time > + */ > + sample.offset = > + ktime_to_ns(ktime_add(end, start)) / 2 - > + ts; > + > + /* simple insertion sort based on duration */ > + index = counter - 1; > + while (index >= 0) { > + if (samples[index].duration_target < > + sample.duration_target) > + break; > + samples[index + 1] = samples[index]; > + index--; > + } > + samples[index + 1] = sample; > + counter++; > + } > + > + i++; > + if (counter >= num_samples || i >= 100000) { > + end_source = ts; > + break; > + } > + } > + > + *source_tstamp = (end_source + start_source) / 2; > + > + /* remove outliers by only using 75% of the samples */ > + used = counter * 3 / 4; > + if (!used) > + used = counter; > + if (used) { > + /* calculate average */ > + s64 off = 0; > + for (index = 0; index < used; index++) > + off += samples[index].offset; > + *offset = div_s64(off, used); > + } > + > + if (samples && samples != buffer) > + kfree(samples); > + > + return used; > +} > +EXPORT_SYMBOL(timecompare_offset); > + > +void __timecompare_update(struct timecompare *sync, > + u64 source_tstamp) > +{ > + s64 offset; > + u64 average_time; > + > + if (!timecompare_offset(sync, &offset, &average_time)) > + return; > + > + printk(KERN_DEBUG > + "average offset: %lld\n", offset); > + > + if (!sync->last_update) { > + sync->last_update = average_time; > + sync->offset = offset; > + sync->skew = 0; > + } else { > + s64 delta_nsec = average_time - sync->last_update; > + > + /* avoid division by negative or small deltas */ > + if (delta_nsec >= 10000) { > + s64 delta_offset_nsec = offset - sync->offset; > + s64 skew; /* delta_offset_nsec * > + TIMECOMPARE_SKEW_RESOLUTION / > + delta_nsec */ > + u64 divisor; > + > + /* div_s64() is limited to 32 bit divisor */ > + skew = delta_offset_nsec * TIMECOMPARE_SKEW_RESOLUTION; > + divisor = delta_nsec; > + while (unlikely(divisor >= ((s64)1) << 32)) { > + /* divide both by 2; beware, right shift > + of negative value has undefined > + behavior and can only be used for > + the positive divisor */ > + skew = div_s64(skew, 2); > + divisor >>= 1; > + } > + skew = div_s64(skew, divisor); > + > + /* > + * Calculate new overall skew as 4/16 the > + * old value and 12/16 the new one. This is > + * a rather arbitrary tradeoff between > + * only using the latest measurement (0/16 and > + * 16/16) and even more weight on past measurements. > + */ > +#define TIMECOMPARE_NEW_SKEW_PER_16 12 > + sync->skew = > + div_s64((16 - TIMECOMPARE_NEW_SKEW_PER_16) * > + sync->skew + > + TIMECOMPARE_NEW_SKEW_PER_16 * skew, > + 16); > + sync->last_update = average_time; > + sync->offset = offset; > + } > + } > +} > +EXPORT_SYMBOL(__timecompare_update); > -- > 1.6.1.2 > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time 2009-02-09 19:27 ` John Stultz @ 2009-02-09 21:46 ` Patrick Ohly 2009-02-09 21:54 ` John Stultz 2009-02-09 22:57 ` David Miller 1 sibling, 1 reply; 23+ messages in thread From: Patrick Ohly @ 2009-02-09 21:46 UTC (permalink / raw) To: John Stultz Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Mon, 2009-02-09 at 21:27 +0200, John Stultz wrote: > On Mon, 2009-02-09 at 18:02 +0100, Patrick Ohly wrote: > > Is this revision of the patch okay? How should the two patches get > > included in the main kernel - via netdev-next-2.6? > > Small comment below, but otherwise it looks ok to me. I usually push > patches through Andrew, so I'd probably go that way. But I'd leave it to > Dave if he's comfortable pushing them to Linus. As you don't mind, I suggest to push through Dave as part of the complete patch series. That way we don't need to worry about coordinating two subtrees. > Acked-by: John Stultz <johnstul@us.ibm.com> Thanks! Will add that. > > + /* ignore negative durations */ > > + sample.duration_target = ktime_to_ns(ktime_sub(end, start)); > > + if (sample.duration_target >= 0) { > > You may also want to checking the bounds on the duration_target. If > preemption hits and the values are too out of whack, the symetric delay > assumption below might be quite invalid. > > I guess the outliers removal probably covers this as well, but seems > some sanity checking might be good. That would require more information, like "duration_target is usually in the xxx-yyy range". This could be determined based on past measurements or the median of the current sample set, but is this really better than the current "remove longest 25%"? In practice I haven't seen such a problem, therefore I'd prefer to keep the code simple and not change it. It was tested under load conditions (both CPU and network). -- Bye, Patrick Ohly -- Patrick.Ohly@gmx.de http://www.estamos.de/ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time 2009-02-09 21:46 ` Patrick Ohly @ 2009-02-09 21:54 ` John Stultz 0 siblings, 0 replies; 23+ messages in thread From: John Stultz @ 2009-02-09 21:54 UTC (permalink / raw) To: Patrick Ohly Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Mon, 2009-02-09 at 22:46 +0100, Patrick Ohly wrote: > On Mon, 2009-02-09 at 21:27 +0200, John Stultz wrote: > > On Mon, 2009-02-09 at 18:02 +0100, Patrick Ohly wrote: > > > Is this revision of the patch okay? How should the two patches get > > > included in the main kernel - via netdev-next-2.6? > > > > Small comment below, but otherwise it looks ok to me. I usually push > > patches through Andrew, so I'd probably go that way. But I'd leave it to > > Dave if he's comfortable pushing them to Linus. > > As you don't mind, I suggest to push through Dave as part of the > complete patch series. That way we don't need to worry about > coordinating two subtrees. > > > Acked-by: John Stultz <johnstul@us.ibm.com> > > Thanks! Will add that. > > > > + /* ignore negative durations */ > > > + sample.duration_target = ktime_to_ns(ktime_sub(end, start)); > > > + if (sample.duration_target >= 0) { > > > > You may also want to checking the bounds on the duration_target. If > > preemption hits and the values are too out of whack, the symetric delay > > assumption below might be quite invalid. > > > > I guess the outliers removal probably covers this as well, but seems > > some sanity checking might be good. > > That would require more information, like "duration_target is usually in > the xxx-yyy range". This could be determined based on past measurements > or the median of the current sample set, but is this really better than > the current "remove longest 25%"? > > In practice I haven't seen such a problem, therefore I'd prefer to keep > the code simple and not change it. It was tested under load conditions > (both CPU and network). Ok. I was just thinking more along the lines of "throw out duration times longer then 25us" or 100us or something relatively sane like that. Again, the largest 25% will probably cover it, but I'm just looking at this with the realtime preemption patches in mind. thanks -john ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time 2009-02-09 19:27 ` John Stultz 2009-02-09 21:46 ` Patrick Ohly @ 2009-02-09 22:57 ` David Miller 1 sibling, 0 replies; 23+ messages in thread From: David Miller @ 2009-02-09 22:57 UTC (permalink / raw) To: johnstul; +Cc: patrick.ohly, linux-kernel, netdev, tglx From: John Stultz <johnstul@us.ibm.com> Date: Mon, 09 Feb 2009 11:27:47 -0800 > On Mon, 2009-02-09 at 18:02 +0100, Patrick Ohly wrote: > > On Thu, 2009-02-05 at 10:21 +0000, Patrick Ohly wrote: > > > On Wed, 2009-02-04 at 21:44 +0200, john stultz wrote: > > > > On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > > > > I sort of object to the name clocksync, as you're not really doing > > > > anything to sync clocks in the code. One, "clock" is an way overloaded > > > > term in the kernel. Two, you're really seem to be just providing deltas > > > > and skew rates between notions of time. I want to avoid someone thinking > > > > "Oh, NTP must use this code". > > > > > > > > So maybe something like timecompare.c? > > > > > > Fine with me. > > > > As there were no other comments I renamed the file, functions and struct > > accordingly. As I said in my mail, I prefer "struct timecompare" over > > "struct time_comparator". I also used "timecompare_transform()". > > > > Is this revision of the patch okay? How should the two patches get > > included in the main kernel - via netdev-next-2.6? > > > > Bye, Patrick > > Small comment below, but otherwise it looks ok to me. I usually push > patches through Andrew, so I'd probably go that way. But I'd leave it to > Dave if he's comfortable pushing them to Linus. > > Acked-by: John Stultz <johnstul@us.ibm.com> Patrick, please submit these two changes fresh now that all of the issues are resolved and you have the ACKs :-) I'll queue them up in net-next-2.6 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 13:01 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Patrick Ohly 2009-02-04 13:01 ` [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time Patrick Ohly @ 2009-02-04 14:03 ` Daniel Walker 2009-02-04 14:46 ` Patrick Ohly 1 sibling, 1 reply; 23+ messages in thread From: Daniel Walker @ 2009-02-04 14:03 UTC (permalink / raw) To: Patrick Ohly Cc: linux-kernel, netdev, David Miller, John Stultz, Thomas Gleixner On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > /** > + * struct cyclecounter - hardware abstraction for a free running counter > + * Provides completely state-free accessors to the underlying hardware. > + * Depending on which hardware it reads, the cycle counter may wrap > + * around quickly. Locking rules (if necessary) have to be defined > + * by the implementor and user of specific instances of this API. > + * > + * @read: returns the current cycle value > + * @mask: bitmask for two's complement > + * subtraction of non 64 bit counters, > + * see CLOCKSOURCE_MASK() helper macro > + * @mult: cycle to nanosecond multiplier > + * @shift: cycle to nanosecond divisor (power of two) > + */ > +struct cyclecounter { > + cycle_t (*read)(const struct cyclecounter *cc); > + cycle_t mask; > + u32 mult; > + u32 shift; > +}; Where are these defined? I don't see any in created in your code. > +/** > + * struct timecounter - layer above a %struct cyclecounter which counts nanoseconds > + * Contains the state needed by timecounter_read() to detect > + * cycle counter wrap around. Initialize with > + * timecounter_init(). Also used to convert cycle counts into the > + * corresponding nanosecond counts with timecounter_cyc2time(). Users > + * of this code are responsible for initializing the underlying > + * cycle counter hardware, locking issues and reading the time > + * more often than the cycle counter wraps around. The nanosecond > + * counter will only wrap around after ~585 years. > + * > + * @cc: the cycle counter used by this instance > + * @cycle_last: most recent cycle counter value seen by > + * timecounter_read() > + * @nsec: continuously increasing count > + */ > +struct timecounter { > + const struct cyclecounter *cc; > + cycle_t cycle_last; > + u64 nsec; > +}; If your mixing generic and non-generic code, it seems a little presumptuous to assume the code would get called more often than the counter wraps. If this cyclecounter is what I think it is (a clocksource) they wrap at varied times. > +/** > + * cyclecounter_cyc2ns - converts cycle counter cycles to nanoseconds > + * @tc: Pointer to cycle counter. > + * @cycles: Cycles > + * > + * XXX - This could use some mult_lxl_ll() asm optimization. Same code > + * as in cyc2ns, but with unsigned result. > + */ > +static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc, > + cycle_t cycles) > +{ > + u64 ret = (u64)cycles; > + ret = (ret * cc->mult) >> cc->shift; > + return ret; > +} This is just outright duplication.. Why wouldn't you use the function that already exists for this? > +/** > + * clocksource_read_ns - get nanoseconds since last call of this function > + * @tc: Pointer to time counter > + * > + * When the underlying cycle counter runs over, this will be handled > + * correctly as long as it does not run over more than once between > + * calls. > + * > + * The first call to this function for a new time counter initializes > + * the time tracking and returns bogus results. > + */ "bogus results" doesn't sound very pretty.. Daniel ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 14:03 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Daniel Walker @ 2009-02-04 14:46 ` Patrick Ohly 2009-02-04 15:09 ` Daniel Walker 0 siblings, 1 reply; 23+ messages in thread From: Patrick Ohly @ 2009-02-04 14:46 UTC (permalink / raw) To: Daniel Walker Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, John Stultz, Thomas Gleixner Hi Daniel! Thanks for looking at this. I think the previous discussion of this patch under the same subject on LKML already addressed your concerns, but I'll also reply in detail below. On Wed, 2009-02-04 at 14:03 +0000, Daniel Walker wrote: > On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > > > /** > > + * struct cyclecounter - hardware abstraction for a free running counter > > + * Provides completely state-free accessors to the underlying hardware. > > + * Depending on which hardware it reads, the cycle counter may wrap > > + * around quickly. Locking rules (if necessary) have to be defined > > + * by the implementor and user of specific instances of this API. > > + * > > + * @read: returns the current cycle value > > + * @mask: bitmask for two's complement > > + * subtraction of non 64 bit counters, > > + * see CLOCKSOURCE_MASK() helper macro > > + * @mult: cycle to nanosecond multiplier > > + * @shift: cycle to nanosecond divisor (power of two) > > + */ > > +struct cyclecounter { > > + cycle_t (*read)(const struct cyclecounter *cc); > > + cycle_t mask; > > + u32 mult; > > + u32 shift; > > +}; > > Where are these defined? I don't see any in created in your code. What do you mean with "these"? cycle_t? That type is defined in clocksource.h. This patch intentionally adds these definitions to the existing file because of the large conceptual overlap. In an earlier revision of the patch I had adapted clocksource itself so that it could be used outside of the time keeping code; John wanted me to use these smaller structs instead that you now find in the current patch. Eventually John wants to refactor clocksource so that it uses them and just adds additional elements in clocksource. Right now clocksource is a mixture of different concepts. Breaking out cyclecounter and timecounter is a first step towards that cleanup. > > +/** > > + * struct timecounter - layer above a %struct cyclecounter which counts nanoseconds > > + * Contains the state needed by timecounter_read() to detect > > + * cycle counter wrap around. Initialize with > > + * timecounter_init(). Also used to convert cycle counts into the > > + * corresponding nanosecond counts with timecounter_cyc2time(). Users > > + * of this code are responsible for initializing the underlying > > + * cycle counter hardware, locking issues and reading the time > > + * more often than the cycle counter wraps around. The nanosecond > > + * counter will only wrap around after ~585 years. > > + * > > + * @cc: the cycle counter used by this instance > > + * @cycle_last: most recent cycle counter value seen by > > + * timecounter_read() > > + * @nsec: continuously increasing count > > + */ > > +struct timecounter { > > + const struct cyclecounter *cc; > > + cycle_t cycle_last; > > + u64 nsec; > > +}; > > If your mixing generic and non-generic code, it seems a little > presumptuous to assume the code would get called more often than the > counter wraps. If this cyclecounter is what I think it is (a > clocksource) they wrap at varied times. timecounter and cyclecounter will have the same owner, so that owner knows how often the cycle counter will run over and can use it accordingly. The cyclecounter is a pointer and not just a an instance of cyclecounter so that the owner can pick an instance of a timecounter which has additional private data attached to it. The initial user of the new code will have a 64 bit hardware register as cycle counter. We decided to leave the "counter runs over too often" problem unsolved until we actually have hardware which exhibits the problem. > > +/** > > + * cyclecounter_cyc2ns - converts cycle counter cycles to nanoseconds > > + * @tc: Pointer to cycle counter. > > + * @cycles: Cycles > > + * > > + * XXX - This could use some mult_lxl_ll() asm optimization. Same code > > + * as in cyc2ns, but with unsigned result. > > + */ > > +static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc, > > + cycle_t cycles) > > +{ > > + u64 ret = (u64)cycles; > > + ret = (ret * cc->mult) >> cc->shift; > > + return ret; > > +} > > This is just outright duplication.. Why wouldn't you use the function > that already exists for this? Because it only works with a clocksource. Adding a second function also allows using a saner return type (unsigned instead of signed). Removing the code duplication can be done as part of the code refactoring. But that really is beyond the scope of the patch series and shouldn't be done in the network tree. > > +/** > > + * clocksource_read_ns - get nanoseconds since last call of this function > > + * @tc: Pointer to time counter > > + * > > + * When the underlying cycle counter runs over, this will be handled > > + * correctly as long as it does not run over more than once between > > + * calls. > > + * > > + * The first call to this function for a new time counter initializes > > + * the time tracking and returns bogus results. > > + */ > > "bogus results" doesn't sound very pretty.. I can call it "undefined" if that sounds better ;-) When you quoted the comment I noticed that the function name was still the old one - fixed. -- Best Regards, Patrick Ohly The content of this message is my personal opinion only and although I am an employee of Intel, the statements I make here in no way represent Intel's position on the issue, nor am I authorized to speak on behalf of Intel on this matter. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 14:46 ` Patrick Ohly @ 2009-02-04 15:09 ` Daniel Walker 2009-02-04 15:24 ` Patrick Ohly 2009-02-04 19:25 ` john stultz 0 siblings, 2 replies; 23+ messages in thread From: Daniel Walker @ 2009-02-04 15:09 UTC (permalink / raw) To: Patrick Ohly Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, John Stultz, Thomas Gleixner On Wed, 2009-02-04 at 15:46 +0100, Patrick Ohly wrote: > Hi Daniel! > > Thanks for looking at this. I think the previous discussion of this > patch under the same subject on LKML already addressed your concerns, > but I'll also reply in detail below. > > On Wed, 2009-02-04 at 14:03 +0000, Daniel Walker wrote: > > On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > > > > > /** > > > + * struct cyclecounter - hardware abstraction for a free running counter > > > + * Provides completely state-free accessors to the underlying hardware. > > > + * Depending on which hardware it reads, the cycle counter may wrap > > > + * around quickly. Locking rules (if necessary) have to be defined > > > + * by the implementor and user of specific instances of this API. > > > + * > > > + * @read: returns the current cycle value > > > + * @mask: bitmask for two's complement > > > + * subtraction of non 64 bit counters, > > > + * see CLOCKSOURCE_MASK() helper macro > > > + * @mult: cycle to nanosecond multiplier > > > + * @shift: cycle to nanosecond divisor (power of two) > > > + */ > > > +struct cyclecounter { > > > + cycle_t (*read)(const struct cyclecounter *cc); > > > + cycle_t mask; > > > + u32 mult; > > > + u32 shift; > > > +}; > > > > Where are these defined? I don't see any in created in your code. > > What do you mean with "these"? cycle_t? That type is defined in > clocksource.h. This patch intentionally adds these definitions to the > existing file because of the large conceptual overlap. No, your creating a new structure here that wasn't declared. I was referring to "struct cyclecounter". I did look up one of your prior submission (Dec. 15) and reviewed that. > In an earlier revision of the patch I had adapted clocksource itself so > that it could be used outside of the time keeping code; John wanted me > to use these smaller structs instead that you now find in the current > patch. Well, I think your original idea was better.. I don't think we need the duplication of underlying clocksource mechanics. > Eventually John wants to refactor clocksource so that it uses them and > just adds additional elements in clocksource. Right now clocksource is a > mixture of different concepts. Breaking out cyclecounter and timecounter > is a first step towards that cleanup. The problem I see is that your putting off the cleanup of struct clocksource with duplication.. It should go in reverse , you should use clocksources for your patch set. Which will motivate John to clean up the clocksource structure. There's a whole other issue which is that we have many architectures already declaring struct clocksource for it's most basic usage. So I think we want clocksource to be the base "cycle counter" structure (in name and usage). Almost everything else in struct clocksource is specific to generic timekeeping and could be factored out easily. Daniel ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 15:09 ` Daniel Walker @ 2009-02-04 15:24 ` Patrick Ohly 2009-02-04 19:25 ` john stultz 1 sibling, 0 replies; 23+ messages in thread From: Patrick Ohly @ 2009-02-04 15:24 UTC (permalink / raw) To: Daniel Walker Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, John Stultz, Thomas Gleixner On Wed, 2009-02-04 at 15:09 +0000, Daniel Walker wrote: > On Wed, 2009-02-04 at 15:46 +0100, Patrick Ohly wrote: > > On Wed, 2009-02-04 at 14:03 +0000, Daniel Walker wrote: > > > On Wed, 2009-02-04 at 14:01 +0100, Patrick Ohly wrote: > > > > > > > /** > > > > + * struct cyclecounter - hardware abstraction for a free running counter > > > > + * Provides completely state-free accessors to the underlying hardware. > > > > + * Depending on which hardware it reads, the cycle counter may wrap > > > > + * around quickly. Locking rules (if necessary) have to be defined > > > > + * by the implementor and user of specific instances of this API. > > > > + * > > > > + * @read: returns the current cycle value > > > > + * @mask: bitmask for two's complement > > > > + * subtraction of non 64 bit counters, > > > > + * see CLOCKSOURCE_MASK() helper macro > > > > + * @mult: cycle to nanosecond multiplier > > > > + * @shift: cycle to nanosecond divisor (power of two) > > > > + */ > > > > +struct cyclecounter { > > > > + cycle_t (*read)(const struct cyclecounter *cc); > > > > + cycle_t mask; > > > > + u32 mult; > > > > + u32 shift; > > > > +}; > > > > > > Where are these defined? I don't see any in created in your code. > > > > What do you mean with "these"? cycle_t? That type is defined in > > clocksource.h. This patch intentionally adds these definitions to the > > existing file because of the large conceptual overlap. > > No, your creating a new structure here that wasn't declared. I was > referring to "struct cyclecounter". Sorry, I still don't see the problem. I'm declaring and defining struct cyclecounter here. Why should it also be declared (= "struct cyclecounter;") elsewhere? > I did look up one of your prior > submission (Dec. 15) and reviewed that. Right, I saw that a bit later. > > In an earlier revision of the patch I had adapted clocksource itself so > > that it could be used outside of the time keeping code; John wanted me > > to use these smaller structs instead that you now find in the current > > patch. > > Well, I think your original idea was better.. I don't think we need the > duplication of underlying clocksource mechanics. Please discuss this with John. I'd prefer to not get caught in the cross-fire of a discussion that I don't know enough about :-/ -- Best Regards, Patrick Ohly The content of this message is my personal opinion only and although I am an employee of Intel, the statements I make here in no way represent Intel's position on the issue, nor am I authorized to speak on behalf of Intel on this matter. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 15:09 ` Daniel Walker 2009-02-04 15:24 ` Patrick Ohly @ 2009-02-04 19:25 ` john stultz 2009-02-04 19:40 ` Daniel Walker 1 sibling, 1 reply; 23+ messages in thread From: john stultz @ 2009-02-04 19:25 UTC (permalink / raw) To: Daniel Walker Cc: Patrick Ohly, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Wed, 2009-02-04 at 07:09 -0800, Daniel Walker wrote: > On Wed, 2009-02-04 at 15:46 +0100, Patrick Ohly wrote: > > In an earlier revision of the patch I had adapted clocksource itself so > > that it could be used outside of the time keeping code; John wanted me > > to use these smaller structs instead that you now find in the current > > patch. > > Well, I think your original idea was better.. I don't think we need the > duplication of underlying clocksource mechanics. > > > Eventually John wants to refactor clocksource so that it uses them and > > just adds additional elements in clocksource. Right now clocksource is a > > mixture of different concepts. Breaking out cyclecounter and timecounter > > is a first step towards that cleanup. > > The problem I see is that your putting off the cleanup of struct > clocksource with duplication.. It should go in reverse , you should use > clocksources for your patch set. Which will motivate John to clean up > the clocksource structure. I strongly disagree. Misusing a established structure for unintended use is just bad. What Patrick wants to use the counters for has very different semantics then how clocksources are used. I think having a bit of redundancy in two structures is good motivation for me to clean up the clocksources to use cyclecounters. thanks -john ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 19:25 ` john stultz @ 2009-02-04 19:40 ` Daniel Walker 2009-02-04 20:06 ` john stultz 0 siblings, 1 reply; 23+ messages in thread From: Daniel Walker @ 2009-02-04 19:40 UTC (permalink / raw) To: john stultz Cc: Patrick Ohly, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Wed, 2009-02-04 at 11:25 -0800, john stultz wrote: > On Wed, 2009-02-04 at 07:09 -0800, Daniel Walker wrote: > > On Wed, 2009-02-04 at 15:46 +0100, Patrick Ohly wrote: > > > In an earlier revision of the patch I had adapted clocksource itself so > > > that it could be used outside of the time keeping code; John wanted me > > > to use these smaller structs instead that you now find in the current > > > patch. > > > > Well, I think your original idea was better.. I don't think we need the > > duplication of underlying clocksource mechanics. > > > > > Eventually John wants to refactor clocksource so that it uses them and > > > just adds additional elements in clocksource. Right now clocksource is a > > > mixture of different concepts. Breaking out cyclecounter and timecounter > > > is a first step towards that cleanup. > > > > The problem I see is that your putting off the cleanup of struct > > clocksource with duplication.. It should go in reverse , you should use > > clocksources for your patch set. Which will motivate John to clean up > > the clocksource structure. > > I strongly disagree. Misusing a established structure for unintended use > is just bad. What Patrick wants to use the counters for has very > different semantics then how clocksources are used. I don't think it's at all bad to reuse an established system like that, especially when the purposed one is largely duplication of the other.. The issue is more that struct clocksource has been loaded up with too many timekeeping specific ornaments .. I would think a good clean up would be just to remove those from the structure. > I think having a bit of redundancy in two structures is good motivation > for me to clean up the clocksources to use cyclecounters. I'm not sure what kind of clean up you have in mind. Given that clocksources are used across many architectures, do you really want to do a mass name change for all of them? Not to mention most people see clocksources as cycle counters , so you would just add confusion in addition to code flux. If your going to end up merging this new cycle counter structure with clocksources anyway, why wouldn't we do it now instead of doing temporary duplication .. Daniel ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 19:40 ` Daniel Walker @ 2009-02-04 20:06 ` john stultz 2009-02-04 21:04 ` Daniel Walker 0 siblings, 1 reply; 23+ messages in thread From: john stultz @ 2009-02-04 20:06 UTC (permalink / raw) To: Daniel Walker Cc: Patrick Ohly, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Wed, 2009-02-04 at 11:40 -0800, Daniel Walker wrote: > On Wed, 2009-02-04 at 11:25 -0800, john stultz wrote: > > On Wed, 2009-02-04 at 07:09 -0800, Daniel Walker wrote: > > > On Wed, 2009-02-04 at 15:46 +0100, Patrick Ohly wrote: > > > > In an earlier revision of the patch I had adapted clocksource itself so > > > > that it could be used outside of the time keeping code; John wanted me > > > > to use these smaller structs instead that you now find in the current > > > > patch. > > > > > > Well, I think your original idea was better.. I don't think we need the > > > duplication of underlying clocksource mechanics. > > > > > > > Eventually John wants to refactor clocksource so that it uses them and > > > > just adds additional elements in clocksource. Right now clocksource is a > > > > mixture of different concepts. Breaking out cyclecounter and timecounter > > > > is a first step towards that cleanup. > > > > > > The problem I see is that your putting off the cleanup of struct > > > clocksource with duplication.. It should go in reverse , you should use > > > clocksources for your patch set. Which will motivate John to clean up > > > the clocksource structure. > > > > I strongly disagree. Misusing a established structure for unintended use > > is just bad. What Patrick wants to use the counters for has very > > different semantics then how clocksources are used. > > I don't think it's at all bad to reuse an established system like that, > especially when the purposed one is largely duplication of the other.. The duplication is only at a very low level. He could not reuse the established clocksource system without really breaking its semantics. > The issue is more that struct clocksource has been loaded up with too > many timekeeping specific ornaments .. I would think a good clean up > would be just to remove those from the structure. I do agree. And I gave it a quick attempt and its not quite trivial. There is some timekeeping ornaments that are required in the clocksource (the two notions of mult, the flags values, etc). Trying to best split this out I think will take some care. > > I think having a bit of redundancy in two structures is good motivation > > for me to clean up the clocksources to use cyclecounters. > > I'm not sure what kind of clean up you have in mind. Given that > clocksources are used across many architectures, do you really want to > do a mass name change for all of them? Not to mention most people see > clocksources as cycle counters , so you would just add confusion in > addition to code flux. No matter what we do, we will have to touch the arch code in some way. Right now I suspect the arch code will still define clocksources, but we can refactor clocksources to use cyclecounters internally. > If your going to end up merging this new cycle counter structure with > clocksources anyway, why wouldn't we do it now instead of doing > temporary duplication .. Patrick could have created his own NICcounter infrastructure somewhere far away from the generic time code and no one would gripe (much like sched_clock in many cases). I think he's doing the right thing by establishing a base structure that we can reuse in the clocksource/timekeeping code. Doing that conversion well so the code is more readable as well as removing structure duplication will take some time and will touch a lot of historically delicate time code, so it will need additional time to sit in testing trees before we push it upward. I see no reason to block his code in the meantime. thanks -john ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 20:06 ` john stultz @ 2009-02-04 21:04 ` Daniel Walker 2009-02-04 21:15 ` john stultz 0 siblings, 1 reply; 23+ messages in thread From: Daniel Walker @ 2009-02-04 21:04 UTC (permalink / raw) To: john stultz Cc: Patrick Ohly, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Wed, 2009-02-04 at 12:06 -0800, john stultz wrote: > The duplication is only at a very low level. He could not reuse the > established clocksource system without really breaking its semantics. He gave a link to the first version, http://kerneltrap.org/mailarchive/linux-netdev/2008/11/19/4164204 What specific semantics is he breaking there? Daniel ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 21:04 ` Daniel Walker @ 2009-02-04 21:15 ` john stultz 2009-02-05 0:18 ` Daniel Walker 0 siblings, 1 reply; 23+ messages in thread From: john stultz @ 2009-02-04 21:15 UTC (permalink / raw) To: Daniel Walker Cc: Patrick Ohly, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Wed, 2009-02-04 at 13:04 -0800, Daniel Walker wrote: > On Wed, 2009-02-04 at 12:06 -0800, john stultz wrote: > > > The duplication is only at a very low level. He could not reuse the > > established clocksource system without really breaking its semantics. > > He gave a link to the first version, > > http://kerneltrap.org/mailarchive/linux-netdev/2008/11/19/4164204 > > What specific semantics is he breaking there? His re-usage of cycle_last and xtime_nsec for other means then how they're defined. In that case his use of xtime_nsec doesn't even store the same unit. Plus he adds other accessors to the clocksource structure that are not compatible with the clocksources registered for timekeeping. He's really doing something different here, and while it does access a counter, and it does translate that into nanoseconds, its not the same as whats done in the timekeeping core which the clocksource was designed around. So by creating his own infrastructure in a shared manner, splitting out a chunk of it to be reused in the clocksource/timekeeping core I think is really a good thing and the right approach. thanks -john ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-04 21:15 ` john stultz @ 2009-02-05 0:18 ` Daniel Walker 2009-02-05 10:21 ` Patrick Ohly 0 siblings, 1 reply; 23+ messages in thread From: Daniel Walker @ 2009-02-05 0:18 UTC (permalink / raw) To: john stultz Cc: Patrick Ohly, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Wed, 2009-02-04 at 13:15 -0800, john stultz wrote: > On Wed, 2009-02-04 at 13:04 -0800, Daniel Walker wrote: > > On Wed, 2009-02-04 at 12:06 -0800, john stultz wrote: > > > > > The duplication is only at a very low level. He could not reuse the > > > established clocksource system without really breaking its semantics. > > > > He gave a link to the first version, > > > > http://kerneltrap.org/mailarchive/linux-netdev/2008/11/19/4164204 > > > > What specific semantics is he breaking there? > > His re-usage of cycle_last and xtime_nsec for other means then how > they're defined. > > In that case his use of xtime_nsec doesn't even store the same unit. Those values to me are timekeeping ornaments .. It's not part of the clocksource it's stuff you added from timekeeping. He could easily add his own ornaments to the clocksource instead of re-use , then they could be merged later .. It's not perfect, but it's at least a start. > Plus he adds other accessors to the clocksource structure that are not > compatible with the clocksources registered for timekeeping. It's akin to vread, I think. I'd prefer to see the read() used instead, but I can see why there could be a need for a structure getting passed. > He's really doing something different here, and while it does access a > counter, and it does translate that into nanoseconds, its not the same > as whats done in the timekeeping core which the clocksource was designed > around. I think it's different from _timekeeping_. However the clocksource isn't getting used differently. Like you said the lowlevel parts are the same, ultimately that's all the clocksource should be. It sounds like that's what you want anyway .. You can merge the lowlevel parts with different sets of ornaments, that seems fairly acceptable to me. Daniel ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-05 0:18 ` Daniel Walker @ 2009-02-05 10:21 ` Patrick Ohly 0 siblings, 0 replies; 23+ messages in thread From: Patrick Ohly @ 2009-02-05 10:21 UTC (permalink / raw) To: Daniel Walker Cc: john stultz, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, David Miller, Thomas Gleixner On Thu, 2009-02-05 at 02:18 +0200, Daniel Walker wrote: > > Plus he adds other accessors to the clocksource structure that are not > > compatible with the clocksources registered for timekeeping. > > It's akin to vread, I think. I'd prefer to see the read() used > instead, > but I can see why there could be a need for a structure getting > passed. There is indeed: the igb driver supports multiple different interfaces. Each interface has its own NIC time, so the callback needs the additional pointer to find out which NIC time it is expected to read. Regarding the rest of the discussion: I understand that it is useful to have it and perhaps get serious about refactoring clocksource, but I also hope that this will not delay including the patches. They are needed for HW time stamping now and cannot wait for the clocksource refactoring. -- Best Regards, Patrick Ohly The content of this message is my personal opinion only and although I am an employee of Intel, the statements I make here in no way represent Intel's position on the issue, nor am I authorized to speak on behalf of Intel on this matter. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH NET-NEXT 0/10] hardware time stamping with new fields in shinfo @ 2009-02-12 14:57 Patrick Ohly 2009-02-12 15:00 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Patrick Ohly 0 siblings, 1 reply; 23+ messages in thread From: Patrick Ohly @ 2009-02-12 14:57 UTC (permalink / raw) To: David Miller Cc: John Stultz, Ronciak, John, linux-kernel@vger.kernel.org, netdev@vger.kernel.org Hello! This revision of the patch series transports hardware time stamps from the hardware into user space via additional fields in struct skb_shared_info. It's based on net-next-2.6 as of this morning. This is the solution that emerged from the discussion of the other approaches suggested before (reuse tstamp, extend skbuff, optional structs): it has the advantage of not touching skbuff and also has the simplest implementation. The clocksource and timecompare patches have been reviewed by John Stultz. He doesn't mind merging them via the net subtree. John Ronciak reviewed the igb driver patches. He suggested to merge the patches as they; after all, PTPd already works fine. I just tested again on 32 and 64 bit x86, both with the timestamping example programs as well as with PTPd. Latest patched PTPd is here: http://github.com/pohly/ptpd/tree/master The open TODOs in the igb driver will be fixed. We think that this will be easier with the infrastructure and the driver in a regular kernel tree. David, I hope you consider the patches acceptable now. If not, then as before: please let me know what I can do to make them acceptable. -- Best Regards, Patrick Ohly The content of this message is my personal opinion only and although I am an employee of Intel, the statements I make here in no way represent Intel's position on the issue, nor am I authorized to speak on behalf of Intel on this matter. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-12 14:57 [PATCH NET-NEXT 0/10] hardware time stamping with new fields in shinfo Patrick Ohly @ 2009-02-12 15:00 ` Patrick Ohly 0 siblings, 0 replies; 23+ messages in thread From: Patrick Ohly @ 2009-02-12 15:00 UTC (permalink / raw) To: David Miller Cc: linux-kernel, netdev, John Stultz, John Ronciak, Patrick Ohly So far struct clocksource acted as the interface between time/timekeeping.c and hardware. This patch generalizes the concept so that a similar interface can also be used in other contexts. For that it introduces new structures and related functions *without* touching the existing struct clocksource. The reasons for adding these new structures to clocksource.[ch] are * the APIs are clearly related * struct clocksource could be cleaned up to use the new structs * avoids proliferation of files with similar names (timesource.h? timecounter.h?) As outlined in the discussion with John Stultz, this patch adds * struct cyclecounter: stateless API to hardware which counts clock cycles * struct timecounter: stateful utility code built on a cyclecounter which provides a nanosecond counter * only the function to read the nanosecond counter; deltas are used internally and not exposed to users of timecounter The code does no locking of the shared state. It must be called at least as often as the cycle counter wraps around to detect these wrap arounds. Both is the responsibility of the timecounter user. Acked-by: John Stultz <johnstul@us.ibm.com> --- include/linux/clocksource.h | 101 +++++++++++++++++++++++++++++++++++++++++++ kernel/time/clocksource.c | 76 ++++++++++++++++++++++++++++++++ 2 files changed, 177 insertions(+), 0 deletions(-) diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h index f88d32f..573819e 100644 --- a/include/linux/clocksource.h +++ b/include/linux/clocksource.h @@ -22,8 +22,109 @@ typedef u64 cycle_t; struct clocksource; /** + * struct cyclecounter - hardware abstraction for a free running counter + * Provides completely state-free accessors to the underlying hardware. + * Depending on which hardware it reads, the cycle counter may wrap + * around quickly. Locking rules (if necessary) have to be defined + * by the implementor and user of specific instances of this API. + * + * @read: returns the current cycle value + * @mask: bitmask for two's complement + * subtraction of non 64 bit counters, + * see CLOCKSOURCE_MASK() helper macro + * @mult: cycle to nanosecond multiplier + * @shift: cycle to nanosecond divisor (power of two) + */ +struct cyclecounter { + cycle_t (*read)(const struct cyclecounter *cc); + cycle_t mask; + u32 mult; + u32 shift; +}; + +/** + * struct timecounter - layer above a %struct cyclecounter which counts nanoseconds + * Contains the state needed by timecounter_read() to detect + * cycle counter wrap around. Initialize with + * timecounter_init(). Also used to convert cycle counts into the + * corresponding nanosecond counts with timecounter_cyc2time(). Users + * of this code are responsible for initializing the underlying + * cycle counter hardware, locking issues and reading the time + * more often than the cycle counter wraps around. The nanosecond + * counter will only wrap around after ~585 years. + * + * @cc: the cycle counter used by this instance + * @cycle_last: most recent cycle counter value seen by + * timecounter_read() + * @nsec: continuously increasing count + */ +struct timecounter { + const struct cyclecounter *cc; + cycle_t cycle_last; + u64 nsec; +}; + +/** + * cyclecounter_cyc2ns - converts cycle counter cycles to nanoseconds + * @tc: Pointer to cycle counter. + * @cycles: Cycles + * + * XXX - This could use some mult_lxl_ll() asm optimization. Same code + * as in cyc2ns, but with unsigned result. + */ +static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc, + cycle_t cycles) +{ + u64 ret = (u64)cycles; + ret = (ret * cc->mult) >> cc->shift; + return ret; +} + +/** + * timecounter_init - initialize a time counter + * @tc: Pointer to time counter which is to be initialized/reset + * @cc: A cycle counter, ready to be used. + * @start_tstamp: Arbitrary initial time stamp. + * + * After this call the current cycle register (roughly) corresponds to + * the initial time stamp. Every call to timecounter_read() increments + * the time stamp counter by the number of elapsed nanoseconds. + */ +extern void timecounter_init(struct timecounter *tc, + const struct cyclecounter *cc, + u64 start_tstamp); + +/** + * timecounter_read - return nanoseconds elapsed since timecounter_init() + * plus the initial time stamp + * @tc: Pointer to time counter. + * + * In other words, keeps track of time since the same epoch as + * the function which generated the initial time stamp. + */ +extern u64 timecounter_read(struct timecounter *tc); + +/** + * timecounter_cyc2time - convert a cycle counter to same + * time base as values returned by + * timecounter_read() + * @tc: Pointer to time counter. + * @cycle: a value returned by tc->cc->read() + * + * Cycle counts that are converted correctly as long as they + * fall into the interval [-1/2 max cycle count, +1/2 max cycle count], + * with "max cycle count" == cs->mask+1. + * + * This allows conversion of cycle counter values which were generated + * in the past. + */ +extern u64 timecounter_cyc2time(struct timecounter *tc, + cycle_t cycle_tstamp); + +/** * struct clocksource - hardware abstraction for a free running counter * Provides mostly state-free accessors to the underlying hardware. + * This is the structure used for system time. * * @name: ptr to clocksource name * @list: list head for registration diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index ca89e15..c46c931 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -31,6 +31,82 @@ #include <linux/sched.h> /* for spin_unlock_irq() using preempt_count() m68k */ #include <linux/tick.h> +void timecounter_init(struct timecounter *tc, + const struct cyclecounter *cc, + u64 start_tstamp) +{ + tc->cc = cc; + tc->cycle_last = cc->read(cc); + tc->nsec = start_tstamp; +} +EXPORT_SYMBOL(timecounter_init); + +/** + * timecounter_read_delta - get nanoseconds since last call of this function + * @tc: Pointer to time counter + * + * When the underlying cycle counter runs over, this will be handled + * correctly as long as it does not run over more than once between + * calls. + * + * The first call to this function for a new time counter initializes + * the time tracking and returns an undefined result. + */ +static u64 timecounter_read_delta(struct timecounter *tc) +{ + cycle_t cycle_now, cycle_delta; + u64 ns_offset; + + /* read cycle counter: */ + cycle_now = tc->cc->read(tc->cc); + + /* calculate the delta since the last timecounter_read_delta(): */ + cycle_delta = (cycle_now - tc->cycle_last) & tc->cc->mask; + + /* convert to nanoseconds: */ + ns_offset = cyclecounter_cyc2ns(tc->cc, cycle_delta); + + /* update time stamp of timecounter_read_delta() call: */ + tc->cycle_last = cycle_now; + + return ns_offset; +} + +u64 timecounter_read(struct timecounter *tc) +{ + u64 nsec; + + /* increment time by nanoseconds since last call */ + nsec = timecounter_read_delta(tc); + nsec += tc->nsec; + tc->nsec = nsec; + + return nsec; +} +EXPORT_SYMBOL(timecounter_read); + +u64 timecounter_cyc2time(struct timecounter *tc, + cycle_t cycle_tstamp) +{ + u64 cycle_delta = (cycle_tstamp - tc->cycle_last) & tc->cc->mask; + u64 nsec; + + /* + * Instead of always treating cycle_tstamp as more recent + * than tc->cycle_last, detect when it is too far in the + * future and treat it as old time stamp instead. + */ + if (cycle_delta > tc->cc->mask / 2) { + cycle_delta = (tc->cycle_last - cycle_tstamp) & tc->cc->mask; + nsec = tc->nsec - cyclecounter_cyc2ns(tc->cc, cycle_delta); + } else { + nsec = cyclecounter_cyc2ns(tc->cc, cycle_delta) + tc->nsec; + } + + return nsec; +} +EXPORT_SYMBOL(timecounter_cyc2time); + /* XXX - Would like a better way for initializing curr_clocksource */ extern struct clocksource clocksource_jiffies; -- 1.6.1.2 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH NET-NEXT 0/10] hardware time stamping with new fields in shinfo 2009-02-12 14:57 [PATCH NET-NEXT 0/10] hardware time stamping with new fields in shinfo Patrick Ohly @ 2009-02-12 15:03 Patrick Ohly 2009-02-12 15:03 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Patrick Ohly 0 siblings, 1 reply; 23+ messages in thread From: Patrick Ohly @ 2009-02-12 15:03 UTC (permalink / raw) To: David Miller Cc: John Stultz, Ronciak, John, linux-kernel@vger.kernel.org, netdev@vger.kernel.org [once more, with signed-off - sorry for the noise] This revision of the patch series transports hardware time stamps from the hardware into user space via additional fields in struct skb_shared_info. It's based on net-next-2.6 as of this morning. This is the solution that emerged from the discussion of the other approaches suggested before (reuse tstamp, extend skbuff, optional structs): it has the advantage of not touching skbuff and also has the simplest implementation. The clocksource and timecompare patches have been reviewed by John Stultz. He doesn't mind merging them via the net subtree. John Ronciak reviewed the igb driver patches. He suggested to merge the patches as they; after all, PTPd already works fine. I just tested again on 32 and 64 bit x86, both with the timestamping example programs as well as with PTPd. Latest patched PTPd is here: http://github.com/pohly/ptpd/tree/master The open TODOs in the igb driver will be fixed. We think that this will be easier with the infrastructure and the driver in a regular kernel tree. David, I hope you consider the patches acceptable now. If not, then as before: please let me know what I can do to make them acceptable. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c 2009-02-12 15:03 [PATCH NET-NEXT 0/10] hardware time stamping with new fields in shinfo Patrick Ohly @ 2009-02-12 15:03 ` Patrick Ohly 0 siblings, 0 replies; 23+ messages in thread From: Patrick Ohly @ 2009-02-12 15:03 UTC (permalink / raw) To: David Miller Cc: linux-kernel, netdev, John Stultz, John Ronciak, Patrick Ohly So far struct clocksource acted as the interface between time/timekeeping.c and hardware. This patch generalizes the concept so that a similar interface can also be used in other contexts. For that it introduces new structures and related functions *without* touching the existing struct clocksource. The reasons for adding these new structures to clocksource.[ch] are * the APIs are clearly related * struct clocksource could be cleaned up to use the new structs * avoids proliferation of files with similar names (timesource.h? timecounter.h?) As outlined in the discussion with John Stultz, this patch adds * struct cyclecounter: stateless API to hardware which counts clock cycles * struct timecounter: stateful utility code built on a cyclecounter which provides a nanosecond counter * only the function to read the nanosecond counter; deltas are used internally and not exposed to users of timecounter The code does no locking of the shared state. It must be called at least as often as the cycle counter wraps around to detect these wrap arounds. Both is the responsibility of the timecounter user. Acked-by: John Stultz <johnstul@us.ibm.com> Signed-off-by: Patrick Ohly <patrick.ohly@intel.com> --- include/linux/clocksource.h | 101 +++++++++++++++++++++++++++++++++++++++++++ kernel/time/clocksource.c | 76 ++++++++++++++++++++++++++++++++ 2 files changed, 177 insertions(+), 0 deletions(-) diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h index f88d32f..573819e 100644 --- a/include/linux/clocksource.h +++ b/include/linux/clocksource.h @@ -22,8 +22,109 @@ typedef u64 cycle_t; struct clocksource; /** + * struct cyclecounter - hardware abstraction for a free running counter + * Provides completely state-free accessors to the underlying hardware. + * Depending on which hardware it reads, the cycle counter may wrap + * around quickly. Locking rules (if necessary) have to be defined + * by the implementor and user of specific instances of this API. + * + * @read: returns the current cycle value + * @mask: bitmask for two's complement + * subtraction of non 64 bit counters, + * see CLOCKSOURCE_MASK() helper macro + * @mult: cycle to nanosecond multiplier + * @shift: cycle to nanosecond divisor (power of two) + */ +struct cyclecounter { + cycle_t (*read)(const struct cyclecounter *cc); + cycle_t mask; + u32 mult; + u32 shift; +}; + +/** + * struct timecounter - layer above a %struct cyclecounter which counts nanoseconds + * Contains the state needed by timecounter_read() to detect + * cycle counter wrap around. Initialize with + * timecounter_init(). Also used to convert cycle counts into the + * corresponding nanosecond counts with timecounter_cyc2time(). Users + * of this code are responsible for initializing the underlying + * cycle counter hardware, locking issues and reading the time + * more often than the cycle counter wraps around. The nanosecond + * counter will only wrap around after ~585 years. + * + * @cc: the cycle counter used by this instance + * @cycle_last: most recent cycle counter value seen by + * timecounter_read() + * @nsec: continuously increasing count + */ +struct timecounter { + const struct cyclecounter *cc; + cycle_t cycle_last; + u64 nsec; +}; + +/** + * cyclecounter_cyc2ns - converts cycle counter cycles to nanoseconds + * @tc: Pointer to cycle counter. + * @cycles: Cycles + * + * XXX - This could use some mult_lxl_ll() asm optimization. Same code + * as in cyc2ns, but with unsigned result. + */ +static inline u64 cyclecounter_cyc2ns(const struct cyclecounter *cc, + cycle_t cycles) +{ + u64 ret = (u64)cycles; + ret = (ret * cc->mult) >> cc->shift; + return ret; +} + +/** + * timecounter_init - initialize a time counter + * @tc: Pointer to time counter which is to be initialized/reset + * @cc: A cycle counter, ready to be used. + * @start_tstamp: Arbitrary initial time stamp. + * + * After this call the current cycle register (roughly) corresponds to + * the initial time stamp. Every call to timecounter_read() increments + * the time stamp counter by the number of elapsed nanoseconds. + */ +extern void timecounter_init(struct timecounter *tc, + const struct cyclecounter *cc, + u64 start_tstamp); + +/** + * timecounter_read - return nanoseconds elapsed since timecounter_init() + * plus the initial time stamp + * @tc: Pointer to time counter. + * + * In other words, keeps track of time since the same epoch as + * the function which generated the initial time stamp. + */ +extern u64 timecounter_read(struct timecounter *tc); + +/** + * timecounter_cyc2time - convert a cycle counter to same + * time base as values returned by + * timecounter_read() + * @tc: Pointer to time counter. + * @cycle: a value returned by tc->cc->read() + * + * Cycle counts that are converted correctly as long as they + * fall into the interval [-1/2 max cycle count, +1/2 max cycle count], + * with "max cycle count" == cs->mask+1. + * + * This allows conversion of cycle counter values which were generated + * in the past. + */ +extern u64 timecounter_cyc2time(struct timecounter *tc, + cycle_t cycle_tstamp); + +/** * struct clocksource - hardware abstraction for a free running counter * Provides mostly state-free accessors to the underlying hardware. + * This is the structure used for system time. * * @name: ptr to clocksource name * @list: list head for registration diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index ca89e15..c46c931 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -31,6 +31,82 @@ #include <linux/sched.h> /* for spin_unlock_irq() using preempt_count() m68k */ #include <linux/tick.h> +void timecounter_init(struct timecounter *tc, + const struct cyclecounter *cc, + u64 start_tstamp) +{ + tc->cc = cc; + tc->cycle_last = cc->read(cc); + tc->nsec = start_tstamp; +} +EXPORT_SYMBOL(timecounter_init); + +/** + * timecounter_read_delta - get nanoseconds since last call of this function + * @tc: Pointer to time counter + * + * When the underlying cycle counter runs over, this will be handled + * correctly as long as it does not run over more than once between + * calls. + * + * The first call to this function for a new time counter initializes + * the time tracking and returns an undefined result. + */ +static u64 timecounter_read_delta(struct timecounter *tc) +{ + cycle_t cycle_now, cycle_delta; + u64 ns_offset; + + /* read cycle counter: */ + cycle_now = tc->cc->read(tc->cc); + + /* calculate the delta since the last timecounter_read_delta(): */ + cycle_delta = (cycle_now - tc->cycle_last) & tc->cc->mask; + + /* convert to nanoseconds: */ + ns_offset = cyclecounter_cyc2ns(tc->cc, cycle_delta); + + /* update time stamp of timecounter_read_delta() call: */ + tc->cycle_last = cycle_now; + + return ns_offset; +} + +u64 timecounter_read(struct timecounter *tc) +{ + u64 nsec; + + /* increment time by nanoseconds since last call */ + nsec = timecounter_read_delta(tc); + nsec += tc->nsec; + tc->nsec = nsec; + + return nsec; +} +EXPORT_SYMBOL(timecounter_read); + +u64 timecounter_cyc2time(struct timecounter *tc, + cycle_t cycle_tstamp) +{ + u64 cycle_delta = (cycle_tstamp - tc->cycle_last) & tc->cc->mask; + u64 nsec; + + /* + * Instead of always treating cycle_tstamp as more recent + * than tc->cycle_last, detect when it is too far in the + * future and treat it as old time stamp instead. + */ + if (cycle_delta > tc->cc->mask / 2) { + cycle_delta = (tc->cycle_last - cycle_tstamp) & tc->cc->mask; + nsec = tc->nsec - cyclecounter_cyc2ns(tc->cc, cycle_delta); + } else { + nsec = cyclecounter_cyc2ns(tc->cc, cycle_delta) + tc->nsec; + } + + return nsec; +} +EXPORT_SYMBOL(timecounter_cyc2time); + /* XXX - Would like a better way for initializing curr_clocksource */ extern struct clocksource clocksource_jiffies; -- 1.6.1.2 ^ permalink raw reply related [flat|nested] 23+ messages in thread
end of thread, other threads:[~2009-02-12 15:03 UTC | newest] Thread overview: 23+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-02-04 13:01 clock synchronization utility code Patrick Ohly 2009-02-04 13:01 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Patrick Ohly 2009-02-04 13:01 ` [PATCH NET-NEXT 02/10] time sync: generic infrastructure to map between time stamps generated by a time counter and system time Patrick Ohly 2009-02-04 19:44 ` john stultz 2009-02-05 10:21 ` Patrick Ohly 2009-02-09 17:02 ` Patrick Ohly 2009-02-09 19:27 ` John Stultz 2009-02-09 21:46 ` Patrick Ohly 2009-02-09 21:54 ` John Stultz 2009-02-09 22:57 ` David Miller 2009-02-04 14:03 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Daniel Walker 2009-02-04 14:46 ` Patrick Ohly 2009-02-04 15:09 ` Daniel Walker 2009-02-04 15:24 ` Patrick Ohly 2009-02-04 19:25 ` john stultz 2009-02-04 19:40 ` Daniel Walker 2009-02-04 20:06 ` john stultz 2009-02-04 21:04 ` Daniel Walker 2009-02-04 21:15 ` john stultz 2009-02-05 0:18 ` Daniel Walker 2009-02-05 10:21 ` Patrick Ohly -- strict thread matches above, loose matches on Subject: below -- 2009-02-12 14:57 [PATCH NET-NEXT 0/10] hardware time stamping with new fields in shinfo Patrick Ohly 2009-02-12 15:00 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Patrick Ohly 2009-02-12 15:03 [PATCH NET-NEXT 0/10] hardware time stamping with new fields in shinfo Patrick Ohly 2009-02-12 15:03 ` [PATCH NET-NEXT 01/10] clocksource: allow usage independent of timekeeping.c Patrick Ohly
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).