All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andi Kleen <ak@suse.de>
To: Suleiman Souhlal <ssouhlal@freebsd.org>
Cc: Linux Kernel ML <linux-kernel@vger.kernel.org>,
	vojtech@suse.cz, Jiri Bohac <jbohac@suse.cz>
Subject: Re: [PATCH 1/2] Make the TSC safe to be used by gettimeofday().
Date: Tue, 14 Nov 2006 03:05:00 +0100	[thread overview]
Message-ID: <200611140305.00383.ak@suse.de> (raw)
In-Reply-To: <45591706.3030106@FreeBSD.org>

On Tuesday 14 November 2006 02:08, Suleiman Souhlal wrote:
> This is done by a per-cpu vxtime structure that stores the last TSC and HPET
> values.
> 
> Whenever we switch to a userland process after a HLT instruction has been
> executed or after the CPU frequency has changed, we force a new read of the
> TSC, HPET and xtime so that we know the correct frequency we have to deal
> with.

Hmm, interesting approach. 

> 
> With this, we can safely use RDTSC in gettimeofday() in CPUs where the
> TSCs are not synchronized, such as Opterons, instead of doing a very expensive
> HPET read.

> +	cpu = hard_smp_processor_id();

Why not smp_processor_id ? 


> +
> +	apicid = hard_smp_processor_id();

The apicid <-> linux cpuid mapping should be 1:1 so i don't know
why you do this separately. All the uses of hard_smp_processor_id
are bogus.

> +
> +	/*
> +	 * We will need to also do this when switching to kernel tasks if we
> +	 * want to use the per-cpu monotonic_clock in the kernel
> +	 */
> +	if (vxtime.pcpu[apicid].need_update == 1 && next_p->mm != NULL)
> +		vxtime_update_pcpu();

Why? It should be really only needed on HLT/cpufreq change, or?

Anyways, I'm not very fond of adding code to the context switch critical
path.


>  		seq = read_seqbegin(&xtime_lock);
> +		preempt_disable();
> +		cpu = hard_smp_processor_id();

Again, shouldn't use hard

>  
> -		sec = xtime.tv_sec;
> -		usec = xtime.tv_nsec / NSEC_PER_USEC;
> +		sec = vxtime.pcpu[cpu].tv_sec;
> +		usec = vxtime.pcpu[cpu].tv_usec;
>  
>  		/* i386 does some correction here to keep the clock 
>  		   monotonous even when ntpd is fixing drift.
> @@ -135,9 +138,13 @@ void do_gettimeofday(struct timeval *tv)
>  		   be found. Note when you fix it here you need to do the same
>  		   in arch/x86_64/kernel/vsyscall.c and export all needed
>  		   variables in vmlinux.lds. -AK */ 
> -		usec += do_gettimeoffset();
> +		t = get_cycles_sync();
> +		x = (((t - vxtime.pcpu[cpu].last_tsc) *
> +		    vxtime.pcpu[cpu].tsc_nsquot) >> NS_SCALE) / NSEC_PER_USEC;
> +		usec += x;
>  
>  	} while (read_seqretry(&xtime_lock, seq));
> +	preempt_enable();

If it can be implemented without races in user space, why does
it need preempt disable in kernel space?

The faster way to access all the vxtime code would be to stick
a pointer to the per CPU vxtime data into the PDA


> +#define NSEC_PER_TICK (NSEC_PER_SEC / HZ)
> +extern unsigned long hpet_tick;
> +
> +extern unsigned long vxtime_hz;

No externs in .c files. Happened earlier too I think

> +
>  static __always_inline void timeval_normalize(struct timeval * tv)
>  {
>  	time_t __sec;
> @@ -57,35 +67,107 @@ static __always_inline void timeval_norm
>  	}
>  }
>  
> +inline int apicid(void)
> +{
> +	int cpu;
> +
> +	__asm __volatile("cpuid" : "=b" (cpu) : "a" (1) : "cx", "dx");
> +	return (cpu >> 24);

The faster way to do this is to use LSL from a magic GDT entry.

> +}
> +
>  static __always_inline void do_vgettimeofday(struct timeval * tv)
>  {
>  	long sequence, t;
>  	unsigned long sec, usec;
> +	int cpu;
>  
>  	do {
>  		sequence = read_seqbegin(&__xtime_lock);
> -		
> -		sec = __xtime.tv_sec;
> -		usec = __xtime.tv_nsec / 1000;
> -
> -		if (__vxtime.mode != VXTIME_HPET) {
> -			t = get_cycles_sync();
> -			if (t < __vxtime.last_tsc)
> -				t = __vxtime.last_tsc;
> -			usec += ((t - __vxtime.last_tsc) *
> -				 __vxtime.tsc_quot) >> 32;
> -			/* See comment in x86_64 do_gettimeofday. */
> -		} else {
> -			usec += ((readl((void __iomem *)
> -				   fix_to_virt(VSYSCALL_HPET) + 0xf0) -
> -				  __vxtime.last) * __vxtime.quot) >> 32;
> -		}
> -	} while (read_seqretry(&__xtime_lock, sequence));
> +		cpu = apicid();
> +
> +		sec = __vxtime.pcpu[cpu].tv_sec;
> +		usec = __vxtime.pcpu[cpu].tv_usec;
> +		rdtscll(t);
> +
> +		usec += (((t - __vxtime.pcpu[cpu].last_tsc) *
> +		    __vxtime.pcpu[cpu].tsc_nsquot) >> NS_SCALE) / NSEC_PER_USEC;
> +	} while (read_seqretry(&__xtime_lock, sequence) || apicid() != cpu);

Hmm, isn't there still a (unlikely, but possible) race: 

         CPU #0                          CPU #1
         read cpu number
 	 switch to CPU #1
                                         read TSC 
                                         switch back to CPU #0
         check cpu number
         check succeeds, but you got wrong TSC data

How do you prevent that? I suppose you could force the seqlock to retry
in this case in the context switch, but I don't think you do that? 

Also I'm not sure what your context switch code is good for.

-Andi

  reply	other threads:[~2006-11-14  2:05 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-11-14  1:06 [PATCH 0/2] Make the TSC safe to be used by gettimeofday() Suleiman Souhlal
2006-11-14  1:08 ` [PATCH 1/2] " Suleiman Souhlal
2006-11-14  2:05   ` Andi Kleen [this message]
2006-11-14  2:25     ` Suleiman Souhlal
2006-11-14  2:44       ` Andi Kleen
2006-11-14  3:35         ` dean gaudet
2006-11-14  4:22           ` dean gaudet
2006-11-14  3:54         ` Suleiman Souhlal
2006-11-14 11:12           ` Andi Kleen
2006-11-14  1:09 ` [PATCH 2/2] Introduce a vmonotonic_clock() vsyscall Suleiman Souhlal
2006-11-14  1:50 ` [PATCH 0/2] Make the TSC safe to be used by gettimeofday() Andi Kleen
2006-11-14  2:06   ` Suleiman Souhlal
2006-11-14 11:10     ` [PATCH 0/2] Make the TSC safe to be used by gettimeofday() II Andi Kleen
2006-11-14 12:30     ` [PATCH 0/2] Make the TSC safe to be used by gettimeofday() Shem Multinymous
2006-11-14 17:06       ` Suleiman Souhlal
2006-11-14 18:30         ` Andi Kleen
2006-11-14 21:28     ` Christoph Lameter
2006-11-14  7:42   ` Arjan van de Ven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200611140305.00383.ak@suse.de \
    --to=ak@suse.de \
    --cc=jbohac@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ssouhlal@freebsd.org \
    --cc=vojtech@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.