From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH v6 4/7] x86: collect CQM information from
	all sockets
Date: Tue, 28 Jan 2014 14:34:32 +0000
Message-ID: <52E7BFF8.1090605@citrix.com>
References: <1386236334-15410-1-git-send-email-dongxiao.xu@intel.com>
	<1386236334-15410-5-git-send-email-dongxiao.xu@intel.com>
	<52DD38240200007800115102@nat28.tlf.novell.com>
	<40776A41FC278F40B59438AD47D147A9119237AD@SHSMSX104.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <40776A41FC278F40B59438AD47D147A9119237AD@SHSMSX104.ccr.corp.intel.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Xu, Dongxiao" <dongxiao.xu@intel.com>
Cc: "keir@xen.org" <keir@xen.org>, "Ian.Campbell@citrix.com" <Ian.Campbell@citrix.com>, "stefano.stabellini@eu.citrix.com" <stefano.stabellini@eu.citrix.com>, "dario.faggioli@citrix.com" <dario.faggioli@citrix.com>, "Ian.Jackson@eu.citrix.com" <Ian.Jackson@eu.citrix.com>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jan Beulich <JBeulich@suse.com>, "dgdegra@tycho.nsa.gov" <dgdegra@tycho.nsa.gov>
List-Id: xen-devel@lists.xenproject.org

On 28/01/14 14:23, Xu, Dongxiao wrote:
>
>>> +    case XEN_SYSCTL_getcqminfo:
>>> +    {
>>> +        struct xen_socket_cqmdata *info;
>>> +        uint32_t num_sockets;
>>> +        uint32_t num_rmids;
>>> +        cpumask_t cpu_cqmdata_map;
>> Unless absolutely avoidable, not CPU masks on the stack please.
> Okay, will allocate it by "xzalloc" function.

cpumask_var_t mask;
{z}alloc_cpumask_var(&mask);
...
free_cpumask_var(mask);

This will switch between a long on the stack and an allocated structure
depending on whether Xen is compiled with fewer or more than 64 pcpu.

>
>>> +
>>> +        if ( !system_supports_cqm() )
>>> +        {
>>> +            ret = -ENODEV;
>>> +            break;
>>> +        }
>>> +
>>> +        select_socket_cpu(&cpu_cqmdata_map);
>>> +
>>> +        num_sockets = min((unsigned
>> int)cpumask_weight(&cpu_cqmdata_map) + 1,
>>> +                          sysctl->u.getcqminfo.num_sockets);
>>> +        num_rmids = get_cqm_count();
>>> +        info = xzalloc_array(struct xen_socket_cqmdata,
>>> +                             num_rmids * num_sockets);
>> While unlikely right now, you ought to consider the case of this
>> multiplication overflowing.
>>
>> Also - how does the caller know how big the buffer needs to be?
>> Only num_sockets can be restricted by it...
>>
>> And what's worse - you allow the caller to limit num_sockets and
>> allocate info based on this limited value, but you don't restrict
>> cpu_cqmdata_map to just the socket covered, i.e. if the caller
>> specified a lower number, then you'll corrupt memory.
> Currently the caller (libxc) sets big num_rmid and num_sockets which are the maximum values that could be available inside the hypervisor.
> If you think this approach is not enough to ensure the security, what about the caller (libxc) issue an hypercall to get the two values from hypervisor, then allocating the same size of num_rmid and num_sockets?
>
>> And finally, I think the total size of the buffer here can easily
>> exceed a page, i.e. this then ends up being a non-order-0
>> allocation, which may _never_ succeed (i.e. the operation is
>> then rendered useless). I guest it'd be better to e.g. vmap()
>> the MFNs underlying the guest buffer.
> Do you mean we check the size of the total size, and allocate MFNs one by one, then vmap them?

I still think this is barking mad as a method of getting this quantity
of data from Xen to the toolstack in a repeated fashon.

Xen should allocate a per-socket buffer at the start of day (or perhaps
on first use of CQM), and the CQM monitoring tool gets to map those
per-socket buffers read-only.

This way, all processing of the CQM data happens in dom0 userspace, not
in Xen in hypercall context; All Xen has to do is periodically dump the
MSRs into the pages.

~Andrew