From mboxrd@z Thu Jan  1 00:00:00 1970
From: Malcolm Crossley <malcolm.crossley@citrix.com>
Subject: Re: [PATCH] xen/domctl: lower loglevel of
	XEN_DOMCTL_memory_mapping
Date: Wed, 9 Sep 2015 16:19:54 +0100
Message-ID: <55F04E1A.6070202@citrix.com>
References: <1441781425-11553-1-git-send-email-tiejun.chen@intel.com>
	<20150909142018.GA28134@l.oracle.com>
	<55F05F7002000078000A15B9@prv-mh.provo.novell.com>
	<20150909145013.GH28134@l.oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <20150909145013.GH28134@l.oracle.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>, Jan Beulich <JBeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>, Ian Campbell <ian.campbell@citrix.com>, Tim Deegan <tim@xen.org>, Ian Jackson <ian.jackson@eu.citrix.com>, xen-devel@lists.xen.org, Tiejun Chen <tiejun.chen@intel.com>
List-Id: xen-devel@lists.xenproject.org

On 09/09/15 15:50, Konrad Rzeszutek Wilk wrote:
> On Wed, Sep 09, 2015 at 08:33:52AM -0600, Jan Beulich wrote:
>>>>> On 09.09.15 at 16:20, <konrad.wilk@oracle.com> wrote:
>>> Perhaps the solution is to remove the first printk(s) and just have them
>>> once the operation has completed? That may fix the outstanding tasklet
>>> problem?
>>
>> Considering that this is a tool stack based retry, how would the
>> hypervisor know when the _whole_ operation is done?
> 
> I was merely thinking of moving the printk _after_ the map_mmio_regions
> so there wouldn't be any outstanding preemption points in map_mmio_regions
> (so it can at least do the 64 PFNs).
> 
> But going forward a couple of ideas:
> 
>  - The 64 limit was arbitrary. It could have been 42 or PFNs / num_online_cpus(),
>    or actually finding out the size of the BAR and figuring the optimal
>    case so that it will be done under 1ms. Or perhaps just provide a
>    boot-time parameter for those that really are struggling with this.

The long time taken to map a large BAR is caused by the unconditional
memory_type_changed call at the bottom of the XEN_DOMCTL_memory_mapping handler.

The memory_type_changed call results in a full data cache flush on a VM with a PCI device
assigned.

We have seen this overhead cause a 1GB BAR to take 20 seconds to map its MMIO.

If the 64 limit was arbitrary then I would suggest increasing it to at least 1024, so that
at least 4M of BAR can be mapped in one go, which reduces the overhead by a factor of 16.
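
As a rough sanity check on those numbers, here is a small sketch. It assumes 4 KiB
pages and one memory_type_changed call (and so one cache flush) per hypercall; the
helper names are made up for illustration and are not Xen code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ULL  /* assumption: 4 KiB pages */

/* Bytes of MMIO covered by one hypercall that maps pfns_per_call frames. */
static uint64_t bytes_per_call(uint64_t pfns_per_call)
{
    return pfns_per_call * PAGE_SIZE_BYTES;
}

/* Hypercalls needed to map a BAR of bar_bytes, rounding up to whole calls. */
static uint64_t calls_needed(uint64_t bar_bytes, uint64_t pfns_per_call)
{
    assert(pfns_per_call > 0);
    uint64_t chunk = bytes_per_call(pfns_per_call);
    return (bar_bytes + chunk - 1) / chunk;
}
```

With a limit of 64 PFNs a 1GB BAR needs 4096 calls; with 1024 PFNs it needs 256,
which is where the factor of 16 comes from.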

Currently the toolstack attempts lower and lower power-of-2 sizes of the MMIO region to be mapped
until the hypercall succeeds.
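
To make that retry behaviour concrete, here is a minimal model of it. This is not
the libxc code: fake_memory_mapping, the -E2BIG return value and the limit constant
are stand-ins I am assuming for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define FAKE_HYPERVISOR_LIMIT 64  /* stands in for the per-call PFN limit */

static unsigned long fake_calls;  /* counts attempts, including rejected ones */

/* Hypothetical stand-in for the hypercall: rejects oversized batches. */
static int fake_memory_mapping(uint64_t first_pfn, uint64_t nr_pfns)
{
    (void)first_pfn;
    fake_calls++;
    return nr_pfns > FAKE_HYPERVISOR_LIMIT ? -E2BIG : 0;
}

/* Map nr_pfns frames, halving the batch size whenever a call is rejected. */
static int map_with_backoff(uint64_t first_pfn, uint64_t nr_pfns, uint64_t batch)
{
    assert(batch > 0);
    uint64_t done = 0;

    while (done < nr_pfns) {
        uint64_t left = nr_pfns - done;
        uint64_t nr = left < batch ? left : batch;
        int rc = fake_memory_mapping(first_pfn + done, nr);

        if (rc == -E2BIG && batch > 1) {
            batch >>= 1;   /* try the next lower power of 2 */
            continue;
        }
        if (rc)
            return rc;     /* any other failure is passed back to the caller */
        done += nr;
    }
    return 0;
}
```

In this model every accepted call would carry the cost of one flush, so the limit
sets how many flushes a large BAR incurs.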

It is not clear to me why we have an unconditional call to memory_type_changed in the domctl.
Can somebody explain why it can't be made conditional on errors?

Malcolm


> 
>  - Perhaps add a new API to the P2M code so it can do the operations
>    in batches. Our map_mmio_region iterates over every PFN which is
>    surely not efficient. Some batching could help? Maybe? What other
>    code could benefit from this? Would the boot-time creation of IOMMU
>    page-tables also help with this (which takes 10 minutes on a 1TB box
>    BTW)
> 
>  - This printk can be altogether removed from the hypervisor and moved
>    into the toolstack. That is, the libxl xc_domain_add_to_physmap
>    could itself use the logging facility (xc_report?) to log the action.
>    
>>
>> Jan
>>
> 