* [Xenomai-core] official TSC model on SMP
@ 2006-09-17 11:54 Jan Kiszka
2006-09-17 17:15 ` Philippe Gerum
0 siblings, 1 reply; 6+ messages in thread
From: Jan Kiszka @ 2006-09-17 11:54 UTC (permalink / raw)
To: xenomai-core
Hi,
reading through the timer code of Xenomai, I just wondered (again) what our
official model of TSCs on multiprocessor boxes is:
1) (practically) perfectly synchronised without offset
2) synchronised but with (unknown?) offset
3) unsynchronised
I'm asking because I'm worried about timestamps that are taken on external
events like interrupts on one CPU and then used to trigger some timed
operation on another. Such things may easily happen via RTDM devices
where we do not communicate the CPU source of event timestamps. But
there is also other code influenced by the TSC model, e.g. data
collection and evaluation for CPU load stats.
Jan
* Re: [Xenomai-core] official TSC model on SMP
2006-09-17 11:54 [Xenomai-core] official TSC model on SMP Jan Kiszka
@ 2006-09-17 17:15 ` Philippe Gerum
2006-09-17 17:36 ` Gilles Chanteperdrix
0 siblings, 1 reply; 6+ messages in thread
From: Philippe Gerum @ 2006-09-17 17:15 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai-core
On Sun, 2006-09-17 at 13:54 +0200, Jan Kiszka wrote:
> Hi,
>
> reading through timer code of Xenomai I just wondered (again) what our
> official model of TSCs on multiprocessor boxes are:
>
> 1) (practically) perfectly synchronised without offset
>
> 2) synchronised but with (unknown?) offset
>
> 3) unsynchronised
The current model is unsynchronized. If anything from the codebase is
found relying on the opposite, be it partially or fully, then it's
utterly broken in the SMP case, and I'm likely the one to blame.
IOW, we don't currently provide any guarantee to anyone that a timestamp
can be interpreted on any CPU other than the one it was read from, and
leave all the related burden to the application developer (e.g. by means
of managing CPU affinity constraints and the like).
>
> I'm asking because I worried about timestamps taken on external events
> like interrupts on one CPU and are then used to trigger some timed
> operation on another. Such things may easily happen via RTDM devices
> where we do not communicate the CPU source of event timestamps. But
> there is also other code influenced by the TSC model, e.g. data
> collection and evaluation for CPU load stats.
>
> Jan
>
--
Philippe.
* Re: [Xenomai-core] official TSC model on SMP
2006-09-17 17:15 ` Philippe Gerum
@ 2006-09-17 17:36 ` Gilles Chanteperdrix
2006-09-18 9:43 ` Jan Kiszka
2006-09-19 21:46 ` Philippe Gerum
0 siblings, 2 replies; 6+ messages in thread
From: Gilles Chanteperdrix @ 2006-09-17 17:36 UTC (permalink / raw)
To: rpm; +Cc: Jan Kiszka, xenomai-core
Philippe Gerum wrote:
> On Sun, 2006-09-17 at 13:54 +0200, Jan Kiszka wrote:
> > Hi,
> >
> > reading through timer code of Xenomai I just wondered (again) what our
> > official model of TSCs on multiprocessor boxes are:
> >
> > 1) (practically) perfectly synchronised without offset
> >
> > 2) synchronised but with (unknown?) offset
> >
> > 3) unsynchronised
>
> The current model is unsynchronized. If anything from the codebase is
> found relying on the opposite, be it partially or fully, then it's
> utterly broken in the SMP case, and I'm likely the one to blame.
> IOW, we don't currently provide any guarantee to anyone that a timestamp
> could be interpreted anywhere else than the CPU it was read from, and
> leave all the related burden to the application developer (e.g. by mean
> of managing CPU affinity constraints and the like).
I think xntimer_set_sched assumes some continuity, since it re-enqueues
on a different CPU a timer whose date was set up on another CPU. If we
wanted strictly unsynchronized TSCs, we would have to:
- compute the difference between the current date and the timer date
- enqueue the timer on a migration queue
- send an IPI to the distant CPU
- on the distant CPU, iterate over the migration queue, recompute the
expiration dates and reenqueue the timers.
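
The four steps above can be sketched roughly as follows. This is only a
single-threaded illustration under stated assumptions, not the actual
Xenomai code: struct sk_timer, sk_timer_migrate, sk_drain_migration_queue,
migration_queue and ipi_pending are all hypothetical names, the IPI is
reduced to a flag, and the locking a real SMP implementation needs is
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 2

/* Hypothetical timer object: 'date' is only meaningful on CPU 'cpu'. */
struct sk_timer {
    uint64_t date;         /* absolute expiry, in the owner CPU's TSC */
    int64_t delta;         /* CPU-neutral remaining delay while in transit */
    int cpu;               /* CPU whose TSC 'date' refers to */
    struct sk_timer *next; /* link in the migration queue */
};

/* One migration queue per destination CPU (hypothetical). */
static struct sk_timer *migration_queue[NR_CPUS];
/* Stand-in for a real inter-processor interrupt. */
static int ipi_pending[NR_CPUS];

/* Source CPU side: compute the delta, enqueue, kick the remote CPU. */
static void sk_timer_migrate(struct sk_timer *t, uint64_t src_now, int dst_cpu)
{
    assert(dst_cpu >= 0 && dst_cpu < NR_CPUS);
    t->delta = (int64_t)(t->date - src_now); /* negative if already late */
    t->next = migration_queue[dst_cpu];
    migration_queue[dst_cpu] = t;
    ipi_pending[dst_cpu] = 1;
}

/*
 * Destination CPU side: rebase every queued timer on the local TSC.
 * Returns the number of timers processed.
 */
static int sk_drain_migration_queue(int cpu, uint64_t dst_now)
{
    struct sk_timer *t = migration_queue[cpu];
    int n = 0;

    migration_queue[cpu] = NULL;
    ipi_pending[cpu] = 0;
    while (t != NULL) {
        struct sk_timer *next = t->next;

        t->date = dst_now + (uint64_t)t->delta;
        t->cpu = cpu;
        t->next = NULL;
        /* A real implementation would re-enqueue on the local timer queue. */
        n++;
        t = next;
    }
    return n;
}
```

Note that in this sketch the time spent between the enqueue and the
drain is added to the expiry, so the timer fires late by roughly the
IPI latency.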
--
Gilles Chanteperdrix.
* Re: [Xenomai-core] official TSC model on SMP
2006-09-17 17:36 ` Gilles Chanteperdrix
@ 2006-09-18 9:43 ` Jan Kiszka
2006-09-19 21:46 ` Philippe Gerum
1 sibling, 0 replies; 6+ messages in thread
From: Jan Kiszka @ 2006-09-18 9:43 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai-core
Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
> > On Sun, 2006-09-17 at 13:54 +0200, Jan Kiszka wrote:
> > > Hi,
> > >
> > > reading through timer code of Xenomai I just wondered (again) what our
> > > official model of TSCs on multiprocessor boxes are:
> > >
> > > 1) (practically) perfectly synchronised without offset
> > >
> > > 2) synchronised but with (unknown?) offset
> > >
> > > 3) unsynchronised
> >
> > The current model is unsynchronized. If anything from the codebase is
> > found relying on the opposite, be it partially or fully, then it's
> > utterly broken in the SMP case, and I'm likely the one to blame.
> > IOW, we don't currently provide any guarantee to anyone that a timestamp
> > could be interpreted anywhere else than the CPU it was read from, and
> > leave all the related burden to the application developer (e.g. by mean
> > of managing CPU affinity constraints and the like).
>
> I think xntimer_set_sched assumes some continuity, since it reenqueues
> on a different CPU a timer whose date has been set up on one cpu. If we
> wanted strict unsynchronized tsc, we would have to :
> - compute the difference between the current date and the timer date
> - enqueue the timer on a migration queue
> - send an IPI to the distant CPU
> - on the distant CPU, iterate over the migration queue, recompute the
> expiration dates and reenqueue the timers.
>
I bet there are a few other wrong assumptions about dates hanging around
in our design. At least all timestamp deliveries of RTDM devices on SMP
fall into this group.
It's a fairly unsatisfying situation. How common are unsynchronised
TSCs, also on non-x86 archs? So far I thought that was an issue of only
a few newer AMD multicore CPUs. I once read the Solaris x86 timer code,
and they, e.g., demand synch'ed TSCs IIRC.
Can we somehow make the unsynch'ed case the exception and deal with it
separately (via periodic re-synch based on IPI)? That would make life
easier, though I agree that explicit CPU scheduling of tasks and IRQs is
more appropriate to RT systems. But sometimes you cannot put the IRQ on
the same CPU as the woken-up task...
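
To make the re-synch idea a bit more concrete, here is a minimal sketch
of what I have in mind. It assumes all TSCs tick at the same rate and
differ only by an offset (model 2 from my first mail), and that some
periodic IPI-based handshake, not shown here, samples a CPU's TSC and
the reference CPU's TSC at the same instant. All names (tsc_offset,
sk_record_offset, sk_convert_timestamp) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4

/* tsc_offset[c] = (TSC of CPU c) - (TSC of reference CPU 0), same instant. */
static int64_t tsc_offset[NR_CPUS];

/* Called by the (hypothetical) periodic re-synch with a matched sample pair. */
static void sk_record_offset(int cpu, uint64_t local_tsc, uint64_t ref_tsc)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    tsc_offset[cpu] = (int64_t)(local_tsc - ref_tsc);
}

/* Translate a timestamp read on from_cpu into the timebase of to_cpu. */
static uint64_t sk_convert_timestamp(uint64_t ts, int from_cpu, int to_cpu)
{
    assert(from_cpu >= 0 && from_cpu < NR_CPUS);
    assert(to_cpu >= 0 && to_cpu < NR_CPUS);
    /* Unsigned wrap-around makes negative offsets work as expected. */
    return ts - (uint64_t)tsc_offset[from_cpu] + (uint64_t)tsc_offset[to_cpu];
}
```

An RTDM driver could then hand out timestamps together with the CPU
they were taken on, and the consumer would convert them before arming a
timer elsewhere. With TSCs that drift against each other, a single
offset is not enough and this breaks down between re-synchs.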
Jan
* Re: [Xenomai-core] official TSC model on SMP
2006-09-17 17:36 ` Gilles Chanteperdrix
2006-09-18 9:43 ` Jan Kiszka
@ 2006-09-19 21:46 ` Philippe Gerum
2006-09-20 8:17 ` Philippe Gerum
1 sibling, 1 reply; 6+ messages in thread
From: Philippe Gerum @ 2006-09-19 21:46 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: Jan Kiszka, xenomai-core
On Sun, 2006-09-17 at 19:36 +0200, Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
> > On Sun, 2006-09-17 at 13:54 +0200, Jan Kiszka wrote:
> > > Hi,
> > >
> > > reading through timer code of Xenomai I just wondered (again) what our
> > > official model of TSCs on multiprocessor boxes are:
> > >
> > > 1) (practically) perfectly synchronised without offset
> > >
> > > 2) synchronised but with (unknown?) offset
> > >
> > > 3) unsynchronised
> >
> > The current model is unsynchronized. If anything from the codebase is
> > found relying on the opposite, be it partially or fully, then it's
> > utterly broken in the SMP case, and I'm likely the one to blame.
> > IOW, we don't currently provide any guarantee to anyone that a timestamp
> > could be interpreted anywhere else than the CPU it was read from, and
> > leave all the related burden to the application developer (e.g. by mean
> > of managing CPU affinity constraints and the like).
>
> I think xntimer_set_sched assumes some continuity, since it reenqueues
> on a different CPU a timer whose date has been set up on one cpu. If we
> wanted strict unsynchronized tsc, we would have to :
> - compute the difference between the current date and the timer date
> - enqueue the timer on a migration queue
> - send an IPI to the distant CPU
> - on the distant CPU, iterate over the migration queue, recompute the
> expiration dates and reenqueue the timers.
>
Looking at the code, xntimer_set_sched is used in the following
contexts:
o setting a CPU affinity for timers just before they get started. They
might be active already but in such a case, we could simply disarm them
before migration since they are going to be started anew right after,
which would solve the issue. AFAICS, this is the most common case.
o changing a CPU affinity for a possibly running (periodic internal)
timer, without restarting it (i.e. xnpod_migrate_thread). This happens
once.
So, basically, the issue is about allowing threads that run a periodic
timer to migrate on the fly, without breaking their ongoing
timeline, ... or not. (And all of the above only applies to the oneshot
system timing, since the periodic jiffy counter is global by nature, and
would never suffer any discontinuity).
--
Philippe.
* Re: [Xenomai-core] official TSC model on SMP
2006-09-19 21:46 ` Philippe Gerum
@ 2006-09-20 8:17 ` Philippe Gerum
0 siblings, 0 replies; 6+ messages in thread
From: Philippe Gerum @ 2006-09-20 8:17 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: Jan Kiszka, xenomai-core
On Tue, 2006-09-19 at 23:46 +0200, Philippe Gerum wrote:
> On Sun, 2006-09-17 at 19:36 +0200, Gilles Chanteperdrix wrote:
> > Philippe Gerum wrote:
> > > On Sun, 2006-09-17 at 13:54 +0200, Jan Kiszka wrote:
> > > > Hi,
> > > >
> > > > reading through timer code of Xenomai I just wondered (again) what our
> > > > official model of TSCs on multiprocessor boxes are:
> > > >
> > > > 1) (practically) perfectly synchronised without offset
> > > >
> > > > 2) synchronised but with (unknown?) offset
> > > >
> > > > 3) unsynchronised
> > >
> > > The current model is unsynchronized. If anything from the codebase is
> > > found relying on the opposite, be it partially or fully, then it's
> > > utterly broken in the SMP case, and I'm likely the one to blame.
> > > IOW, we don't currently provide any guarantee to anyone that a timestamp
> > > could be interpreted anywhere else than the CPU it was read from, and
> > > leave all the related burden to the application developer (e.g. by mean
> > > of managing CPU affinity constraints and the like).
> >
> > I think xntimer_set_sched assumes some continuity, since it reenqueues
> > on a different CPU a timer whose date has been set up on one cpu. If we
> > wanted strict unsynchronized tsc, we would have to :
> > - compute the difference between the current date and the timer date
> > - enqueue the timer on a migration queue
> > - send an IPI to the distant CPU
> > - on the distant CPU, iterate over the migration queue, recompute the
> > expiration dates and reenqueue the timers.
> >
>
> Looking at the code, xntimer_set_sched is used in the following
> contexts:
>
> o setting a CPU affinity for timers just before they get started. They
> might be active already but in such a case, we could simply disarm them
> before migration since they are going to be started anew right after,
> which would solve the issue. AFAIC, this is the most common case.
>
> o changing a CPU affinity for a possibly running (periodic internal)
> timer, without restarting it (i.e. xnpod_migrate_thread). This happens
> once.
>
> So, basically, the issue is about allowing threads that run a periodic
> timer to migrate on the fly, without breaking their ongoing
> timeline, ... or not. (And all of the above only applies to the oneshot
> system timing, since the periodic jiffy counter is global by nature, and
> would never suffer any discontinuity).
>
To elaborate a bit more on this, and considering that only stopped
timers are migrated, we would indeed need a timer migration queue on the
remote CPU. However, we might assume that the date field of the linked
timers conveys the original date, not the TSC-converted one or a delay
derived from it, and we could safely assume that such timers are not
indexed by any of the data structures that usually hold the outstanding
timers. Simply reusing the sched IPI for kicking the remote CPU to fire
up the migrated timers would even be an option.
--
Philippe.