From: George Anzinger <george@mvista.com>
To: john stultz <johnstul@us.ibm.com>
Cc: Christoph Lameter <clameter@sgi.com>,
lkml <linux-kernel@vger.kernel.org>,
tim@physik3.uni-rostock.de, albert@users.sourceforge.net,
Ulrich.Windl@rz.uni-regensburg.de,
Len Brown <len.brown@intel.com>,
linux@dominikbrodowski.de, David Mosberger <davidm@hpl.hp.com>,
Andi Kleen <ak@suse.de>,
paulus@samba.org, schwidefsky@de.ibm.com, jimix@us.ibm.com,
keith maanthey <kmannth@us.ibm.com>, greg kh <greg@kroah.com>,
Patricia Gaughen <gone@us.ibm.com>,
Chris McDermott <lcm@us.ibm.com>
Subject: Re: [RFC][PATCH] new timeofday core subsystem (v.A0)
Date: Fri, 03 Sep 2004 00:43:30 -0700
Message-ID: <413820A2.80409@mvista.com>
In-Reply-To: <1094175004.14662.440.camel@cog.beaverton.ibm.com>

john stultz wrote:
> On Thu, 2004-09-02 at 17:47, Christoph Lameter wrote:
>
>>On Thu, 2 Sep 2004, john stultz wrote:
>>
>>
>>>>Of course, but it's not a generic way of timer access and
>>>>requires a fastcall for each timer. You still have the problem of
>>>>exporting the frequency and the time offsets to user space (those also
>>>>need to be kept current in order for one to be able to calculate a timer
>>>>value!). The syscall/fastcall API then needs to be extended to allow for a
>>>>system call to access each of the individual timers supported.
>>>
>>>Indeed, it would require a fastcall accessor for each timesource, but
>>>maybe fastcall is the wrong way to describe it. Could it just be a
>>>function that uses the epc to escalate its privileges? As for freq and
>>>offsets, those would already be user-visible (see below for more
>>>details)
>>
>>The only current way to enter the kernel from glibc with a fastcall is
>>the EPC.
>
>
> Hmm. I must be explaining myself poorly, or not understanding you. I
> apologize for not understanding this EPC/fastcall business well enough.
> I'd like to use EPC from a user-executable kernel page to escalate
> privileges to access the hardware counter. I don't care if I have to use
> the current fastcall (fsys.S) interface or not. However, you're
> making it sound like this isn't possible, so I'll need to do some
> research.
>
>
>>>The plan at the moment is that the timeofday core gettimeofday code path
>>>as well as any timesource that supports it adds a _vsyscall linker
>>>attribute. Then the linker will place all the code on a special page or
>>>set of pages. Those pages would then be mapped in as user-readable. Then
>>>just like the x86-64's vsyscall gettimeofday, glibc would re-direct
>>>gettimeofday calls to the user-mapped kernel pages, where it would
>>>execute in user mode (with maybe the epc privilege escalation for ia64
>>>time sources that required it).
>>
>>The EPC call already does do a *secure* transfer like this on IA64 and
>>will execute kernel code without user space mapping. This idea raises all sorts
>>of concerns....
>
>
> Yes, but it's not portable. Reducing duplicate code so timeofday
> maintenance isn't a nightmare is the first goal here. It may not be
> completely achievable, and when I hit that point I'll have to rework the
> design, but at this point I'm not convinced that it cannot be done.
>
> As for security concerns, all you're mapping out to userspace are the time
> variables and the functions needed to convert those to an accurate time
> of day. This is almost what you suggest below, but with the additional
> math from the kernel. The method I'm suggesting is very similar to
> x86-64's arch/x86-64/kernel/vsyscall.c.
>
>
>
>>>I had to do most of this for the i386 vsyscall-gettimeofday port, but I
>>>was unhappy with the duplication of code (and bugs), as well as the fact
>>>that I was then being pushed to re-do it for each architecture. While
>>>its not currently implemented (I was hoping to keep the discussion
>>>focused on the correctness of what's been implemented), I feel the plan
>>>for user-mode access won't be too complex. I'm still open for further
>>>discussion if you'd like, obviously performance is quite important, and
>>>I want to calm any fears you have, but I'm sure the new ntp code has plenty
>>>of performance issues to look at before we start digging into usermode
>>>access, so maybe we can come back to this discussion later?
>>
>>The functions in the timer source structure are a central problem for IA64. We
>>cannot take that out right now.
>
>
> Don't worry I'm not submitting this code just yet. :) I'll need all (or
> at least most of the important) architecture maintainers to sign on
> before I try to push this code in.
>
> Right now I'm trying to shake out possible problems with the design and
> the first pass implementation of the code. You're helping me do that,
> and I thank you for it. Concessions will be made, but for now I'm going
> to try to preserve the current design and work around the problems as
> they arise.
>
>
>
>>The full parameterization of timer access as I have suggested also allows
>>user page access to timers with simply mapping a page to access the timer.
>>No other gimmicks are needed since the timer source structure contains all
>>information to calculate a timer offset.
>>
>>
>>>Yes, but x86-64 has one way, and ia64 does it another, and i know ppc
>>>folks have talked about their own user mode time access. Chasing down a
>>>time bug across arches gets to be fairly hairy, so I'm trying to
>>>simplify that.
>>
>>The simplest thing is to provide a data structure without any functions
>>attached that can simply be copied into userspace if wanted. If an arch
>>needs special time access then that is depending on the arch specific
>>methods available and such a data structure as I have proposed will
>>include all the info necessary to implement such user mode time access.
>
>
> Ehhh.. I really don't like the idea of giving all the raw values to
> userspace and letting user-code do the timeofday calculation. Fixing
> bugs in each arches timeofday code is hard enough. Imagine if we have to
> go through and fix userspace too! It would also make a user/kernel data
> interface that we'd have to preserve. I'd like to avoid that and instead
> use the vsyscall method to give us greater flexibility. Plus I doubt
> anyone would want to implement the NTP adjustments in userspace? eek!
>
>
>>A requirement to call functions in the kernel to determine time makes
>>these solutions impossible. And it's getting extremely complex if one has to
>>invoke different functions for each timer supported.
>
>
> The struct timesource interface of read()/delta()/cyc2ns() was the best
> generalization I could come up with. I feel they're necessary for the
> following reasons:
>
> cyc2ns(): In this conversion we can optimize the math depending on the
> timesource. If the timesource freq is a power of 2, we can just use
> shift! However if its a weird value and we have to be very precise, we
> do a full 64bit divide. We're not stuck with one equation given a freq
> value.
>
> delta(): Some counters don't fill 32 or 64 bits. ACPI PM time source is
> 24 bits, and the cyclone is 40. Thus to do proper twos complement
> subtraction without overflow worries you need to mask the subtraction.
> This can be done by exporting a mask value w/ the freq value, but was
> cleaner when moved into the timesource.
>
> read(): Rather than just giving the address of the register, the read
> call allows for timesource specific logic. This lets us use jiffies as a
> timesource, or in cases like the ACPI PM timesource, where the register
> must be read 3 times in order to ensure a correct value is latched, we
> can avoid having to include that logic into the generic code, so it does
> not affect systems that do not use or have that timesource.
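
The delta() and cyc2ns() points quoted above can be made concrete with a small user-space sketch. This is only an illustration, not code from the patch: the names masked_delta(), cyc2ns_pow2() and cyc2ns_div() are made up here, and the counter width and frequency are passed in as plain parameters rather than taken from any real timesource structure.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper, not from the patch: cycles elapsed between two
 * reads of a free-running counter that is only `bits` wide. The mask
 * discards the borrow bits produced when the counter has wrapped.
 */
static uint64_t masked_delta(uint64_t now, uint64_t last, unsigned bits)
{
	uint64_t mask;

	assert(bits >= 1 && bits <= 64);
	mask = (bits == 64) ? ~0ULL : ((1ULL << bits) - 1);
	return (now - last) & mask;
}

/*
 * Frequency is exactly (1 << shift) Hz, so the divide becomes a shift.
 * Assumes a small delta: the product overflows 64 bits once cycles
 * exceeds roughly 1.8e10.
 */
static uint64_t cyc2ns_pow2(uint64_t cycles, unsigned shift)
{
	assert(shift < 64);
	return (cycles * NSEC_PER_SEC) >> shift;
}

/* Arbitrary frequency: fall back to a full 64-bit divide. */
static uint64_t cyc2ns_div(uint64_t cycles, uint64_t freq_hz)
{
	assert(freq_hz != 0);
	return (cycles * NSEC_PER_SEC) / freq_hz;
}
```

For a 24-bit counter that wrapped from 0xFFFFFB to 0x000005, masked_delta(0x000005, 0xFFFFFB, 24) gives 10 cycles, where an unmasked subtraction would give a huge value. Whether the mask lives in the timesource or is exported next to the frequency is exactly the design question being argued above; the arithmetic is the same either way.
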

I am not convinced that 3 reads are in fact needed. In fact, I am almost certain
that two is more than enough. In fact, it takes so long to read it that I just
use one read and a sanity check in the HRT code. Here is the code I use:

unsigned long
quick_get_cpuctr(void)
{
	static unsigned long last_read = 0;
	static int qgc_max = 0;
	int i;
	unsigned long rd_delta, rd_ans, rd = inl(acpi_pm_tmr_address);

	/*
	 * This will be REALLY big if ever we move backward in time...
	 */
	rd_delta = (rd - last_read) & SIZE_MASK;
	last_read = rd;

	rd_ans = (rd - last_update) & SIZE_MASK;

	if (likely((rd_ans < (arch_cycles_per_jiffy << 1)) &&
		   (rd_delta < (arch_cycles_per_jiffy << 1))))
		return rd_ans;

	for (i = 0; i < 10; i++) {
		rd = inl(acpi_pm_tmr_address);
		rd_delta = (rd - last_read) & SIZE_MASK;
		last_read = rd;
		if (unlikely(i > qgc_max))
			qgc_max = i;
		/*
		 * On my test machine (800MHZ dual PIII) this is always
		 * seven.  Seems long, but we will give it some slack...
		 * We note that rd_delta (and all the vars) unsigned so
		 * a backward movement will show as a really big number.
		 */
		if (likely(rd_delta < 20))
			return (rd - last_update) & SIZE_MASK;
	}
	return (rd - last_update) & SIZE_MASK;
}

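
The retry part of the function above can be tried outside the kernel with a stand-in for the hardware read. What follows is a rough sketch under stated assumptions, not the HRT code: read_stable(), scripted_read() and the scripted values are invented for illustration, the port read is replaced by a function pointer, and the first scripted value plays the role of one bad read.

```c
#include <stdint.h>

#define SIZE_MASK 0xFFFFFFu	/* 24-bit counter, assumed for this sketch */

typedef uint32_t (*read_fn)(void);

/*
 * Hypothetical helper: keep reading until two consecutive reads are
 * fewer than max_step cycles apart, or give up after max_tries and
 * return the most recent value. The subtraction is unsigned and
 * masked, so a backward step shows up as a very large delta.
 */
static uint32_t read_stable(read_fn rd_hw, uint32_t max_step, int max_tries)
{
	uint32_t prev = rd_hw() & SIZE_MASK;
	int i;

	for (i = 0; i < max_tries; i++) {
		uint32_t cur = rd_hw() & SIZE_MASK;

		if (((cur - prev) & SIZE_MASK) < max_step)
			return cur;
		prev = cur;
	}
	return prev;
}

/* Scripted stand-in for the hardware: one bogus value, then sane ones. */
static const uint32_t script[] = { 100, 0x800000, 0x800005, 0x800009 };
static unsigned script_pos;

static uint32_t scripted_read(void)
{
	uint32_t v = script[script_pos];

	/* Stay on the last entry once the script is exhausted. */
	if (script_pos + 1 < sizeof(script) / sizeof(script[0]))
		script_pos++;
	return v;
}
```

With that script, read_stable(scripted_read, 20, 10) rejects the pair (100, 0x800000), accepts the pair (0x800000, 0x800005), and returns 0x800005. The threshold of 20 and the limit of 10 tries simply mirror the constants in the function above.
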
--
George Anzinger george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml