From: George Anzinger <george@mvista.com>
To: john stultz <johnstul@us.ibm.com>
Cc: Christoph Lameter <clameter@sgi.com>,
	lkml <linux-kernel@vger.kernel.org>,
	tim@physik3.uni-rostock.de, albert@users.sourceforge.net,
	Ulrich.Windl@rz.uni-regensburg.de,
	Len Brown <len.brown@intel.com>,
	linux@dominikbrodowski.de, David Mosberger <davidm@hpl.hp.com>,
	Andi Kleen <ak@suse.de>,
	paulus@samba.org, schwidefsky@de.ibm.com, jimix@us.ibm.com,
	keith maanthey <kmannth@us.ibm.com>, greg kh <greg@kroah.com>,
	Patricia Gaughen <gone@us.ibm.com>,
	Chris McDermott <lcm@us.ibm.com>
Subject: Re: [RFC][PATCH] new timeofday core subsystem (v.A0)
Date: Fri, 03 Sep 2004 00:43:30 -0700
Message-ID: <413820A2.80409@mvista.com>
In-Reply-To: <1094175004.14662.440.camel@cog.beaverton.ibm.com>

john stultz wrote:
> On Thu, 2004-09-02 at 17:47, Christoph Lameter wrote:
> 
>>On Thu, 2 Sep 2004, john stultz wrote:
>>
>>
>>>>Of course, but it's not a generic way of timer access and
>>>>requires a fastcall for each timer. You still have the problem of
>>>>exporting the frequency and the time offsets to user space (those also
>>>>need to be kept current in order for one to be able to calculate a timer
>>>>value!). The syscall/fastcall API then needs to be extended to allow for a
>>>>system call to access each of the individual timers supported.
>>>
>>>Indeed, it would require a fastcall accessor for each timesource, but
>>>maybe fastcall is the wrong way to describe it. Could it just be a
>>>function that uses the epc to escalate its privileges? As for freq and
>>>offsets, those would already be user-visible (see below for more
>>>details)
>>
>>The only current way to enter the kernel from glibc with a fastcall is
>>the EPC.
> 
> 
> Hmm. I must be explaining myself poorly, or not understanding you. I
> apologize for not understanding this EPC/fastcall business well enough.
> I'd like to use EPC from a user-executable kernel page to escalate
> privileges to access the hardware counter. I don't care if I have to use
> the current fastcall (fsys.S) interface or not. However you're
> making sounds like this isn't possible, so I'll need to do some
> research. 
> 
> 
>>>The plan at the moment is that the timeofday core gettimeofday code path
>>>as well as any timesource that supports it adds a _vsyscall linker
>>>attribute. Then the linker will place all the code on a special page or
>>>set of pages. Those pages would then be mapped in as user-readable. Then
>>>just like the x86-64's vsyscall gettimeofday, glibc would re-direct
>>>gettimeofday calls to the user-mapped kernel pages, where it would
>>>execute in user mode (with maybe the epc privilege escalation for ia64
>>>time sources that required it).
>>
>>The EPC call already does do a *secure* transfer like this on IA64 and
>>will execute kernel code without user space mapping. This idea raises all sorts
>>of concerns....
> 
> 
> Yes, but it's not portable. Reducing duplicate code so timeofday
> maintenance isn't a nightmare is the first goal here. It may not be
> completely achievable, and when I hit that point I'll have to rework the
> design, but at this point I'm not convinced that it cannot be done.
> 
> As for security concerns, all you're mapping out to userspace are the time
> variables and the functions needed to convert those to an accurate time
> of day. This is almost what you suggest below, but with the additional
> math from the kernel. The method I'm suggesting is very similar to
> x86-64's arch/x86-64/kernel/vsyscall.c.
> 
> 
> 
>>>I had to do most of this for the i386 vsyscall-gettimeofday port, but I
>>>was unhappy with the duplication of code (and bugs), as well as the fact
>>>that I was then being pushed to re-do it for each architecture. While
>>>it's not currently implemented (I was hoping to keep the discussion
>>>focused on the correctness of what's been implemented), I feel the plan
>>>for user-mode access won't be too complex. I'm still open for further
>>>discussion if you'd like, obviously performance is quite important, and
>>>I want to calm any fears you have, but I'm sure the new ntp code has plenty
>>>of performance issues to look at before we start digging into usermode
>>>access, so maybe we can come back to this discussion later?
>>
>>The functions in the timer source structure are a central problem for IA64. We
>>cannot take that out right now.
> 
> 
> Don't worry I'm not submitting this code just yet. :) I'll need all (or
> at least most of the important) architecture maintainers to sign on
> before I try to push this code in.
> 
> Right now I'm trying to shake out possible problems with the design and
> the first pass implementation of the code. You're helping me do that,
> and I thank you for it. Concessions will be made, but for now I'm going
> to try to preserve the current design and work around the problems as
> they arise. 
> 
> 
> 
>>The full parameterization of timer access as I have suggested also allows
>>user page access to timers with simply mapping a page to access the timer.
>>No other gimmicks are needed since the timer source structure contains all
>>information to calculate a timer offset.
>>
>>
>>>Yes, but x86-64 has one way, and ia64 does it another, and I know ppc
>>>folks have talked about their own user mode time access. Chasing down a
>>>time bug across arches gets to be fairly hairy, so I'm trying to
>>>simplify that.
>>
>>The simplest thing is to provide a data structure without any functions
>>attached that can simply be copied into userspace if wanted. If an arch
>>needs special time access then that is depending on the arch specific
>>methods available and such a data structure as I have proposed will
>>include all the info necessary to implement such user mode time access.
> 
> 
> Ehhh.. I really don't like the idea of giving all the raw values to
> userspace and letting user-code do the timeofday calculation. Fixing
> bugs in each arch's timeofday code is hard enough. Imagine if we have to
> go through and fix userspace too! It would also make a user/kernel data
> interface that we'd have to preserve. I'd like to avoid that and instead
> use the vsyscall method to give us greater flexibility. Plus I doubt
> anyone would want to implement the NTP adjustments in userspace? eek!
> 
> 
>>A requirement to call functions in the kernel to determine time makes
>>this solution impossible. And it's getting extremely complex if one has to
>>invoke different functions for each timer supported.
> 
> 
> The struct timesource interface of read()/delta()/cyc2ns() was the best
> generalization I could come up with. I feel they're necessary for the
> following reasons:
> 	
> cyc2ns(): In this conversion we can optimize the math depending on the
> timesource. If the timesource freq is a power of 2, we can just use
> shift! However, if it's a weird value and we have to be very precise, we
> do a full 64bit divide. We're not stuck with one equation given a freq
> value.
> 
> delta(): Some counters don't fill 32 or 64 bits. ACPI PM time source is
> 24 bits, and the cyclone is 40. Thus to do proper twos complement
> subtraction without overflow worries you need to mask the subtraction.
> This can be done by exporting a mask value w/ the freq value, but was
> cleaner when moved into the timesource. 
> 
> read(): Rather than just giving the address of the register, the read
> call allows for timesource specific logic. This lets us use jiffies as a
> timesource, or in cases like the ACPI PM timesource, where the register
> must be read 3 times in order to ensure a correct value is latched, we
> can avoid having to include that logic into the generic code, so it does
> not affect systems that do not use or have that timesource.
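
As a side illustration of the delta() and cyc2ns() points quoted above, here
is a minimal user-space sketch. It is not taken from the patch: the function
names (counter_mask, masked_delta, cyc2ns_shift, cyc2ns_div) are hypothetical,
and the conversions assume cycles * 10^9 fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low `bits` bits of a counter (hypothetical helper). */
uint64_t counter_mask(unsigned bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * Cycles elapsed from `then` to `now` on a counter that is only `bits`
 * wide.  The unsigned subtraction wraps, and the mask discards the bits
 * the hardware does not implement, so one wrap of the counter is handled.
 */
uint64_t masked_delta(uint64_t now, uint64_t then, unsigned bits)
{
	return (now - then) & counter_mask(bits);
}

/*
 * Cycles to nanoseconds when the frequency is exactly 2^shift Hz:
 * the divide becomes a shift.  Assumes cycles * 10^9 does not overflow.
 */
uint64_t cyc2ns_shift(uint64_t cycles, unsigned shift)
{
	return (cycles * UINT64_C(1000000000)) >> shift;
}

/*
 * Cycles to nanoseconds for an arbitrary frequency: full 64-bit divide.
 * Assumes cycles * 10^9 does not overflow and freq_hz is nonzero.
 */
uint64_t cyc2ns_div(uint64_t cycles, uint64_t freq_hz)
{
	return (cycles * UINT64_C(1000000000)) / freq_hz;
}
```

For a 24-bit counter, masked_delta(0x000005, 0xFFFFFE, 24) yields 7 even
though the raw value went "backward" across the wrap. Keeping the mask and
the conversion inside per-timesource functions is what lets each source pick
the cheaper variant.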

I am not convinced that 3 reads are in fact needed.  I am almost certain that
two are more than enough.  The read itself takes so long that I just use one
read and a sanity check in the HRT code.  Here is the code I use:

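/*
 * Read the ACPI PM timer once and return the cycles elapsed since
 * last_update; re-read only if the value fails a sanity check.
 * acpi_pm_tmr_address, last_update, SIZE_MASK and arch_cycles_per_jiffy
 * are not defined in this excerpt.
 */
unsigned long
quick_get_cpuctr(void)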
{
	static  unsigned long last_read = 0;
	static  int qgc_max = 0;
	int i;

	unsigned long rd_delta, rd_ans, rd = inl(acpi_pm_tmr_address);

	/*
	 * This will be REALLY big if ever we move backward in time...
	 */
	rd_delta = (rd - last_read) & SIZE_MASK;
	last_read = rd;

	rd_ans =  (rd - last_update) & SIZE_MASK;

	if (likely((rd_ans < (arch_cycles_per_jiffy << 1)) &&
		   (rd_delta < (arch_cycles_per_jiffy << 1))))
		return rd_ans;

	for (i = 0; i < 10; i++) {
		rd = inl(acpi_pm_tmr_address);
		rd_delta = (rd - last_read) & SIZE_MASK;
		last_read = rd;
		if (unlikely(i > qgc_max))
			qgc_max = i;
		/*
		 * On my test machine (800MHZ dual PIII) this is always
		 * seven.  Seems long, but we will give it some slack...
		 * We note that rd_delta (and all the vars) are unsigned so
		 * a backward movement will show as a really big number.
		 */
		if (likely(rd_delta < 20))
			return (rd - last_update) & SIZE_MASK;
	}
	return (rd - last_update) & SIZE_MASK;
}
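
The same one-read-plus-sanity-check idea can be exercised outside the kernel.
The sketch below is only an illustration under assumptions: the counter is
taken to be 24 bits wide, the port read is replaced by a scripted fake, and
every name in it (checked_reader, checked_read, scripted_read, PM_MASK) is
hypothetical rather than part of the HRT code.

```c
#include <assert.h>
#include <stdint.h>

#define PM_MASK 0xFFFFFFu	/* assumption: 24-bit counter */

typedef uint32_t (*read_fn)(void *ctx);

struct checked_reader {
	read_fn read;		/* raw counter read (fake or real) */
	void *ctx;		/* opaque argument passed to read() */
	uint32_t last_read;	/* previous raw value */
	uint32_t loose_step;	/* largest plausible advance since last call */
	uint32_t tight_step;	/* largest plausible advance between re-reads */
	int max_retries;	/* bound on the re-read loop */
};

/* Fake counter that replays a fixed list of values, for testing only. */
struct scripted {
	const uint32_t *vals;
	int n;
	int pos;
};

uint32_t scripted_read(void *ctx)
{
	struct scripted *s = ctx;

	assert(s->pos < s->n);
	return s->vals[s->pos++];
}

/*
 * Fast path: one read, accepted if the masked step from the previous
 * read is small.  Slow path: re-read until two consecutive reads are
 * close together, or give up after max_retries and return the last one.
 */
uint32_t checked_read(struct checked_reader *r)
{
	uint32_t rd = r->read(r->ctx) & PM_MASK;
	uint32_t step = (rd - r->last_read) & PM_MASK;
	int i;

	r->last_read = rd;
	if (step < r->loose_step)
		return rd;

	for (i = 0; i < r->max_retries; i++) {
		rd = r->read(r->ctx) & PM_MASK;
		step = (rd - r->last_read) & PM_MASK;
		r->last_read = rd;
		if (step < r->tight_step)
			break;
	}
	return rd;
}
```

Because the step is unsigned and masked, a value that appears to move
backward shows up as a very large step and falls into the re-read loop, which
is the same property the comments in the code above rely on.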


-- 
George Anzinger   george@mvista.com
High-res-timers:  http://sourceforge.net/projects/high-res-timers/
Preemption patch: http://www.kernel.org/pub/linux/kernel/people/rml

