From: George Anzinger <george@mvista.com>
To: john stultz <johnstul@us.ibm.com>
Cc: Christoph Lameter <clameter@sgi.com>,
lkml <linux-kernel@vger.kernel.org>,
tim@physik3.uni-rostock.de, albert@users.sourceforge.net,
Ulrich.Windl@rz.uni-regensburg.de,
Len Brown <len.brown@intel.com>,
linux@dominikbrodowski.de, David Mosberger <davidm@hpl.hp.com>,
Andi Kleen <ak@suse.de>,
paulus@samba.org, schwidefsky@de.ibm.com, jimix@us.ibm.com,
keith maanthey <kmannth@us.ibm.com>, greg kh <greg@kroah.com>,
Patricia Gaughen <gone@us.ibm.com>,
Chris McDermott <lcm@us.ibm.com>
Subject: Re: [RFC][PATCH] new timeofday core subsystem (v.A0)
Date: Fri, 03 Sep 2004 00:43:30 -0700 [thread overview]
Message-ID: <413820A2.80409@mvista.com> (raw)
In-Reply-To: <1094175004.14662.440.camel@cog.beaverton.ibm.com>
john stultz wrote:
> On Thu, 2004-09-02 at 17:47, Christoph Lameter wrote:
>
>>On Thu, 2 Sep 2004, john stultz wrote:
>>
>>
>>>>Of course but its not a generic way of timer acccess and
>>>>requires a fastcall for each timer. You still have the problem of
>>>>exporting the frequency and the time offsets to user space (those also
>>>>need to be kept current in order for one to be able to calculate a timer
>>>>value!). The syscall/fastcall API then needs to be extended to allow for a
>>>>system call to access each of the individual timers supported.
>>>
>>>Indeed, it would require a fastcall accessor for each timesource, but
>>>maybe fastcall is the wrong way to describe it. Could it just be a
>>>function that uses the epc to escalate its privileges? As for freq and
>>>offsets, those would already be user-visible (see below for more
>>>details)
>>
>>The only way curent way to enter the kernel from glibc with a fastcall is
>>the EPC.
>
>
> Hmm. I must be explaining myself poorly, or not understanding you. I
> apologize for not understanding this EPC/fastcall business well enough.
> I'd like to use EPC from a user-executable kernel page to escalate
> privileges to access the hardware counter. I don't care if I have to use
> the the current fastcall (fsys.S) interface or not. However you're
> making sounds like this isn't possible, so I'll need to do some
> research.
>
>
>>>The plan at the moment is that the timeofday core gettimeofday code path
>>>as well as any timesource that supports it adds a _vsyscall linker
>>>attribute. Then the linker will place all the code on a special page or
>>>set of pages. Those pages would then be mapped in as user-readable. Then
>>>just like the x86-64's vsyscall gettimeofday, glibc would re-direct
>>>gettimeofday calls to the user-mapped kernel pages, where it would
>>>execute in user mode (with maybe the epc privilege escalation for ia64
>>>time sources that required it).
>>
>>The EPC call already does do a *secure* transfer like this on IA64 and
>>will execute kernel code without user space mapping. This idea raises all sorts
>>of concerns....
>
>
> Yes, but its not portable. Reducing duplicate code so timeofday
> maintenance isn't a nightmare is the first goal here. It may not be
> completely achievable, and when I hit that point I'll have to rework the
> design, but at this point I'm not convinced that it cannot be done.
>
> As for security concerns, all your mapping out to userspace are the time
> variables and the functions needed to convert those to a accurate time
> of day. This is almost what you suggest below, but with the additional
> math from the kernel. The method I'm suggesting is very similar to
> x86-64's arch/x86-64/kernel/vsyscall.c.
>
>
>
>>>I had to do most of this for the i386 vsyscall-gettimeofday port, but I
>>>was unhappy with the duplication of code (and bugs), as well as the fact
>>>that I was then being pushed to re-do it for each architecture. While
>>>its not currently implemented (I was hoping to keep the discussion
>>>focused on the correctness of what's been implemented), I feel the plan
>>>for user-mode access won't be too complex. I'm still open for further
>>>discussion if you'd like, obviously performance is quite important, and
>>>I want to calm any fears you have, but I'm sure the new ntp code plenty
>>>of performance issues to look at before we start digging into usermode
>>>access, so maybe we can come back to this discussion later?
>>
>>The functions in the timer source structure is a central problem for IA64. We
>>cannot take that out right now.
>
>
> Don't worry I'm not submitting this code just yet. :) I'll need all (or
> at least most of the important) architecture maintainers to sign on
> before I try to push this code in.
>
> Right now I'm trying to shake out possible problems with the design and
> the first pass implementation of the code. You're helping me do that,
> and I thank you for it. Concessions will be made, but for now I'm going
> to try to preserve the current design and work around the problems as
> they arise.
>
>
>
>>The full parameterization of timer access as I have suggested also allows
>>user page access to timers with simply mapping a page to access the timer.
>>No other gimmicks are needed since the timer source structure contains all
>>information to calculate a timer offset.
>>
>>
>>>Yes, but x86-64 has one way, and ia64 does it another, and i know ppc
>>>folks have talked about their own user mode time access. Chasing down a
>>>time bug across arches gets to be fairly hairy, so I'm trying to
>>>simplify that.
>>
>>The simplest thins is to provide a data structure without any functions
>>attached that can simply be copied into userspace if wanted. If an arch
>>needs special time access then that is depending on the arch specific
>>methods available and such a data structure as I have proposed will
>>include all the info necessary to implement such user mode time access.
>
>
> Ehhh.. I really don't like the idea of giving all the raw values to
> userspace and letting user-code do the timeofday calculation. Fixing
> bugs in each arches timeofday code is hard enough. Imagine if we have to
> go through and fix userspace too! It would also make a user/kernel data
> interface that we'd have to preserve. I'd like to avoid that and instead
> use the vsyscall method to give us greater flexibility. Plus I doubt
> anyone would want to implement the NTP adjustments in userspace? eek!
>
>
>>A requirement to call functions in the kernel to determine time makes
>>these solution impossible. And its getting extremely complex if one has to
>>invoke different functions for each timer supported.
>
>
> The struct timesource interface of read()/delta()/cyc2ns() was the best
> generalization I could come up with. I feel they're necessary for the
> following reasons:
>
> cyc2ns(): In this conversion we can optimize the math depending on the
> timesource. If the timesource freq is a power of 2, we can just use
> shift! However if its a weird value and we have to be very precise, we
> do a full 64bit divide. We're not stuck ith one equation given a freq
> value.
>
> delta(): Some counters don't fill 32 or 64 bits. ACPI PM time source is
> 24 bits, and the cyclone is 40. Thus to do proper twos complement
> subtraction without overflow worries you need to mask the subtraction.
> This can be done by exporting a mask value w/ the freq value, but was
> cleaner when moved into the timesource.
>
> read(): Rather then just giving the address of the register, the read
> call allows for timesource specific logic. This lets us use jiffies as a
> timesource, or in cases like the ACPI PM timesource, where the register
> must be read 3 times in order to ensure a correct value is latched, we
> can avoid having to include that logic into the generic code, so it does
> not affect systems that do not use or have that timesource.
I am not convince that 3 reads are in fact needed. In fact, I am amost certain
that two is more than enough. In fact, it takes so long to read it that I just
use one read and a sanity check in the HRT code. Here is the code I use:
unsigned long
quick_get_cpuctr(void)
{
static unsigned long last_read = 0;
static int qgc_max = 0;
int i;
unsigned long rd_delta, rd_ans, rd = inl(acpi_pm_tmr_address);
/*
* This will be REALLY big if ever we move backward in time...
*/
rd_delta = (rd - last_read) & SIZE_MASK;
last_read = rd;
rd_ans = (rd - last_update) & SIZE_MASK;
if (likely((rd_ans < (arch_cycles_per_jiffy << 1)) &&
(rd_delta < (arch_cycles_per_jiffy << 1))))
return rd_ans;
for (i = 0; i < 10; i++) {
rd = inl(acpi_pm_tmr_address);
rd_delta = (rd - last_read) & SIZE_MASK;
last_read = rd;
if (unlikely(i > qgc_max))
qgc_max = i;
/*
* On my test machine (800MHZ dual PIII) this is always
* seven. Seems long, but we will give it some slack...
* We note that rd_delta (and all the vars) unsigned so
* a backward movement will show as a really big number.
*/
if (likely(rd_delta < 20))
return (rd - last_update) & SIZE_MASK;
}
return (rd - last_update) & SIZE_MASK;
}
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml
next prev parent reply other threads:[~2004-09-03 7:47 UTC|newest]
Thread overview: 86+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-09-02 21:07 [RFC] New Time of day proposal (updated 9/2/04) john stultz
2004-09-02 21:09 ` [RFC][PATCH] new timeofday core subsystem (v.A0) john stultz
2004-09-02 21:11 ` [RFC][PATCH] new timeofday i386 hooks (v.A0) john stultz
2004-09-02 21:12 ` [RFC][PATCH] new timeofday i386 timesources (v.A0) john stultz
2004-09-03 1:44 ` [RFC][PATCH] new timeofday i386 hooks (v.A0) George Anzinger
2004-09-03 2:06 ` john stultz
2004-09-03 8:07 ` Ulrich Windl
2004-09-03 18:09 ` George Anzinger
2004-09-02 22:19 ` [RFC][PATCH] new timeofday core subsystem (v.A0) Christoph Lameter
2004-09-02 22:28 ` john stultz
2004-09-02 22:42 ` Christoph Lameter
2004-09-02 23:14 ` john stultz
2004-09-02 23:39 ` Christoph Lameter
2004-09-03 0:07 ` john stultz
2004-09-03 0:47 ` Christoph Lameter
2004-09-03 1:30 ` john stultz
2004-09-03 7:43 ` George Anzinger [this message]
2004-09-03 19:32 ` john stultz
2004-09-03 16:18 ` Christoph Lameter
2004-09-03 21:00 ` john stultz
2004-09-03 22:04 ` Christoph Lameter
2004-09-03 23:00 ` john stultz
2004-09-04 0:11 ` Christoph Lameter
2004-09-03 1:39 ` George Anzinger
2004-09-03 1:58 ` john stultz
2004-09-03 6:42 ` Albert Cahalan
2004-09-03 7:24 ` George Anzinger
2004-09-03 19:27 ` john stultz
2004-09-03 22:10 ` George Anzinger
2004-09-03 23:32 ` john stultz
2004-09-04 0:02 ` George Anzinger
2004-09-08 18:07 ` john stultz
2004-09-09 0:08 ` George Anzinger
2004-09-09 0:51 ` john stultz
2004-09-09 3:14 ` Christoph Lameter
2004-09-09 3:32 ` john stultz
2004-09-09 4:31 ` George Anzinger
2004-09-09 6:37 ` Jesse Barnes
2004-09-09 8:09 ` George Anzinger
2004-09-09 19:07 ` john stultz
2004-09-09 20:49 ` George Anzinger
2004-09-13 21:29 ` Christoph Lameter
2004-09-13 22:25 ` john stultz
2004-09-13 22:45 ` Christoph Lameter
2004-09-14 6:53 ` Ulrich Windl
2004-09-14 17:49 ` Christoph Lameter
2004-09-15 0:57 ` George Anzinger
2004-09-15 3:32 ` Christoph Lameter
2004-09-15 8:04 ` George Anzinger
2004-09-15 8:54 ` Dominik Brodowski
2004-09-15 17:54 ` George Anzinger
2004-09-15 9:12 ` Andi Kleen
2004-09-15 15:46 ` Christoph Lameter
2004-09-15 18:00 ` George Anzinger
2004-09-15 18:28 ` Christoph Lameter
2004-09-15 6:46 ` Christoph Lameter
2004-09-15 16:32 ` john stultz
2004-09-15 16:46 ` Christoph Lameter
2004-09-15 17:13 ` john stultz
2004-09-15 17:30 ` Christoph Lameter
2004-09-15 18:48 ` john stultz
2004-09-15 19:58 ` George Anzinger
2004-09-15 20:20 ` Christoph Lameter
2004-09-16 7:02 ` Ulrich Windl
2004-09-03 19:18 ` john stultz
2004-09-02 22:09 ` [RFC] New Time of day proposal (updated 9/2/04) Christoph Lameter
2004-09-02 22:22 ` john stultz
2004-09-02 22:47 ` Christoph Lameter
2004-09-03 9:54 ` Dominik Brodowski
2004-09-03 19:41 ` john stultz
2004-09-03 20:26 ` Dominik Brodowski
2004-09-03 21:05 ` john stultz
2004-09-06 6:26 ` Ulrich Windl
2004-09-06 11:56 ` Alan Cox
2004-09-07 16:14 ` Christoph Lameter
2004-09-03 15:17 ` Andi Kleen
2004-09-03 20:11 ` john stultz
2004-09-04 13:00 ` Andi Kleen
2004-09-07 16:10 ` Christoph Lameter
2004-09-07 18:24 ` George Anzinger
2004-09-07 20:55 ` Christoph Lameter
2004-09-07 21:42 ` George Anzinger
2004-09-08 6:26 ` Ulrich Windl
2004-09-08 18:25 ` john stultz
[not found] <413850B9.15119.BA95FD@rkdvmks1.ngate.uni-regensburg.de>
[not found] ` <1094224071.431.7758.camel@cube>
2004-09-06 6:08 ` [RFC][PATCH] new timeofday core subsystem (v.A0) Ulrich Windl
2004-09-12 17:11 ` Albert Cahalan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=413820A2.80409@mvista.com \
--to=george@mvista.com \
--cc=Ulrich.Windl@rz.uni-regensburg.de \
--cc=ak@suse.de \
--cc=albert@users.sourceforge.net \
--cc=clameter@sgi.com \
--cc=davidm@hpl.hp.com \
--cc=gone@us.ibm.com \
--cc=greg@kroah.com \
--cc=jimix@us.ibm.com \
--cc=johnstul@us.ibm.com \
--cc=kmannth@us.ibm.com \
--cc=lcm@us.ibm.com \
--cc=len.brown@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@dominikbrodowski.de \
--cc=paulus@samba.org \
--cc=schwidefsky@de.ibm.com \
--cc=tim@physik3.uni-rostock.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox