From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH v6 1/5] x86: allow reading MSR_IA32_TSC with XENPF_resource_op Date: Fri, 23 Jan 2015 14:57:18 +0000 Message-ID: <54C2614E.6070706@citrix.com> References: <1422020420-23115-1-git-send-email-chao.p.peng@linux.intel.com> <1422020420-23115-2-git-send-email-chao.p.peng@linux.intel.com> <54C268840200007800058DA2@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <54C268840200007800058DA2@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich , Chao Peng Cc: wei.liu2@citrix.com, Ian.Campbell@citrix.com, stefano.stabellini@eu.citrix.com, Ian.Jackson@eu.citrix.com, xen-devel@lists.xen.org, will.auld@intel.com, keir@xen.org List-Id: xen-devel@lists.xenproject.org On 23/01/15 14:28, Jan Beulich wrote: >>>> On 23.01.15 at 14:40, wrote: >> @@ -133,10 +135,39 @@ static void resource_access(void *info) >> switch ( entry->u.cmd ) >> { >> case XEN_RESOURCE_OP_MSR_READ: >> - ret = rdmsr_safe(entry->idx, entry->val); >> + if ( unlikely(entry->idx == MSR_IA32_TSC) ) { >> + /* Return scaled time instead of raw timestamp */ >> + entry->val = get_s_time_fixed(tsc); > This is going to be bogus when happening on the first entry. > Either disallow it, or rdtscll() here if tsc == 0. > >> + ret = 0; >> + } >> + else >> + { >> + unsigned long irqflags; >> + /* >> + * If next entry is MSR_IA32_TSC read, then the actual rdtscll >> + * is performed together with current entry, with IRQ disabled. >> + */ >> + bool_t read_tsc = (i < ra->nr_done - 1 && >> + unlikely(entry[1].idx == MSR_IA32_TSC && >> + entry[1].u.cmd == XEN_RESOURCE_OP_MSR_READ)); > Just like you do the rdtscll() without regard to rc (which is fine), > I don't think you need that last part of the condition. > >> --- a/xen/include/public/platform.h >> +++ b/xen/include/public/platform.h >> @@ -540,6 +540,16 @@ DEFINE_XEN_GUEST_HANDLE(xenpf_core_parking_t); >> #define XEN_RESOURCE_OP_MSR_READ 0 >> #define XEN_RESOURCE_OP_MSR_WRITE 1 >> >> +/* >> + * Specially handled MSRs: >> + * - MSR_IA32_TSC >> + * READ: Returns the scaled system time(ns) instead of raw timestamp. In >> + * multiple entry case, if other MSR read is followed by a MSR_IA32_TSC >> + * read, then both reads are guaranteed to be performed atomically (with >> + * IRQ disabled). The return time indicates the point of reading that MSR. >> + * WRITE: Not supported. >> + */ > So before adding this I'd really like to once again understand what > the consumer of this is going to use this for: The scaled system time > normally isn't very useful to user mode code, hence whether we > return ticks or nanoseconds doesn't seem to make a big difference - > unless user mode code is expected to only ever look at the delta of > two such values. In which case I'd consider obfuscating the real > value by some artificial (and perhaps randomized at boot time) bias. A delta is precisely the usecase. Calculating the actual memory bandwidth requires two MSR samples and the time in between them. Originally, the time was calculated with a usleep() library call but I objected to this because of scheduling getting in the way of measuring an accurate time delta. ~Andrew