From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH v20 10/10] tools: CMDs and APIs for Cache
 Monitoring Technology
Date: Mon, 6 Oct 2014 15:27:38 +0100
Message-ID: <5432A6DA.2030401@citrix.com>
References: <1412337315-15609-1-git-send-email-chao.p.peng@linux.intel.com>
	<1412337315-15609-11-git-send-email-chao.p.peng@linux.intel.com>
	<20141003124956.GB7627@zion.uk.xensource.com>
	<20141006133253.GA25440@pengc-linux>
	<20141006135502.GC7627@zion.uk.xensource.com>
	<20141006141828.GC25440@pengc-linux>
	<20141006142435.GE7627@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <20141006142435.GE7627@zion.uk.xensource.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu <wei.liu2@citrix.com>, Chao Peng <chao.p.peng@linux.intel.com>
Cc: keir@xen.org, Ian.Campbell@citrix.com, stefano.stabellini@eu.citrix.com, George.Dunlap@eu.citrix.com, Ian.Jackson@eu.citrix.com, xen-devel@lists.xen.org, JBeulich@suse.com, dgdegra@tycho.nsa.gov
List-Id: xen-devel@lists.xenproject.org

On 06/10/14 15:24, Wei Liu wrote:
> On Mon, Oct 06, 2014 at 10:18:28PM +0800, Chao Peng wrote:
>> On Mon, Oct 06, 2014 at 02:55:02PM +0100, Wei Liu wrote:
>>> On Mon, Oct 06, 2014 at 09:32:53PM +0800, Chao Peng wrote:
>>>> On Fri, Oct 03, 2014 at 01:49:56PM +0100, Wei Liu wrote:
>>>>> Thanks for this quick turnaround.
>>>>>
>>>>> Overall this looks good to me. Just some more questions on one thing I
>>>>> don't understand.
>>>>>
>>>>> On Fri, Oct 03, 2014 at 07:55:15PM +0800, Chao Peng wrote:
>>>>> [...]
>>>>>> +int libxl__pick_random_socket_cpu(libxl__gc *gc, uint32_t socketid)
>>>>>> +{
>>>>> This name is clearer.
>>>>>
>>>>> But still, why is randomization required?
>>>>>
>>>>> Does this mean picking an arbitrary CPU returns the same result to the
>>>>> library user? If so, why is randomization required?
>>>> The background here is that the L3 cache info we want to get in this
>>>> patch series is a per-socket resource. To get it, we need to run the
>>>> related RDMSR from a cpu in that socket. So the real purpose of this
>>>> routine is to pick a cpu number in that socket. From a functional
>>>> perspective, any cpu in that socket should work.
>>>>
>>>> But for different domains we may have more than one
>>>> getting-l3-cache-info operation for a given socket. We want to avoid
>>>> running all these operations on the same cpu every time. So the
>>>> randomization is used for load balancing among all the cpus in the same
>>>> socket.
>>>>
>>> I'm not sure how much we can get from this randomization. Are you
>>> implying this operation is quite a heavy workload for a cpu and / or
>>> there are potentially hundreds or thousands of parallel operations
>>> executed at the same time? FWIW in order to get cpu topology you need to
>>> issue a hypercall, which is quite expensive (perhaps not that expensive
>>> compared to the CMT operation itself?).
>> For CMT itself, I don't think we gain much benefit from this. But we
>> introduced a new generic resource_op hypercall which can be used for
>> potentially heavy workloads in the future. So we add this function on the
>> tools side.
> If so, I would rather avoid doing premature optimization. We can
> always add it in later when it's necessary.
>
> Wei.

The core randomisation per socket started when all of this was
implemented in Xen, and there would be repeated IPIs to core 0 on each
socket for the information.  At that point, it was far more likely to
repeatedly bounce the same VM in and out of non-root mode.

Now that this is all in the toolstack, that is far less likely to happen,
and it is fine to drop the optimisation.
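
For illustration only, a minimal sketch of what a non-randomised pick
might look like: return the lowest-numbered cpu in the requested socket.
The helper name and the cpu_to_socket array are hypothetical and are not
taken from the patch or from libxl; the sketch assumes the caller has
already obtained a per-cpu socket map from the topology information.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the libxl implementation: return the
 * lowest-numbered cpu that belongs to socketid, or -1 if no cpu does.
 * cpu_to_socket[i] is assumed to hold the socket ID of cpu i.
 */
static int pick_first_socket_cpu(const uint32_t *cpu_to_socket,
                                 size_t nr_cpus, uint32_t socketid)
{
    size_t i;

    /* A NULL map is only acceptable when there are no cpus to scan. */
    assert(cpu_to_socket != NULL || nr_cpus == 0);

    for (i = 0; i < nr_cpus; i++) {
        if (cpu_to_socket[i] == socketid)
            return (int)i;
    }

    return -1;
}
```

Any cpu in the socket should work functionally, so a deterministic pick
like this would be enough unless a heavier user of the hypercall turns
up later.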

~Andrew