From: Tomasz Wroblewski
Subject: Re: GPU passthrough performance regression in >4GB vms due to XSA-60 changes
Date: Mon, 19 May 2014 12:42:12 +0200
To: Jan Beulich
Cc: xen-devel@lists.xenproject.org

On 05/19/2014 12:29 PM, Tomasz Wroblewski wrote:
>
> On 05/16/2014 04:36 PM, Jan Beulich wrote:
>>>>> On 16.05.14 at 13:38, wrote:
>>>>>> On 16.05.14 at 13:18, wrote:
>>>>> If I coded up a patch to deal with this on -unstable, would you be
>>>>> able to test that?
>>>> Willing to give it a go (xen major version updates are often
>>>> problematic to do though, so I can't promise success). What would
>>>> your patch be doing? Adding entries to the MTRR for the relocated
>>>> regions?
>>> This and properly declare the region in ACPI's _CRS. For starters I'll
>>> probably try keeping the WB default overlaid with UC variable ranges,
>>> as that's going to be the less intrusive change.
>> Okay here are two patches - the first to deal with the above mentioned
>> items, and the second to further increase correctness and at once
>> shrink the number of MTRR regions needed.
>>
>> Afaict they apply equally well to stable-4.3, master, and staging.
>>
>> But to be honest I don't expect any performance improvement, all
>> I'd expect is that BARs relocated above 4Gb would now get treated
>> equally to such below 4Gb - UC in all cases.
> Thanks Jan. I've tried the patches and you're correct, putting UC in
> the MTRR for the relocated region didn't help the issue. However, I
> had to hack that in manually - the code paths to do that in your
> hvmloader patch were not activating. hvmloader is not programming the
> guest PCI BARs into 64-bit regions at all, but is still programming
> them as 32-bit regions... on closer inspection this seems to be
> because the using_64bar condition, as well as bar64_relocate in
> hvmloader/pci.c, is always false.
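
For context, this is roughly how I read the hole-sizing logic in
hvmloader/pci.c - paraphrased from memory into a small standalone
program rather than quoted verbatim, with illustrative constants and a
made-up main(), so treat it as a sketch of my understanding only. It
shows why bar64_relocate (and hence using_64bar) can stay false while
guest RAM still gets relocated above 4GB to enlarge the 32-bit hole:

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PCI_MEM_END  0xfc000000u  /* top of the low MMIO hole (illustrative) */
#define PAGE_SHIFT   12

int main(void)
{
    uint32_t pci_mem_start = 0xf0000000u;  /* default hole start */
    uint64_t mmio_total    = 0x18000000;   /* BAR space needed: 384MB */
    uint64_t low_mem_pgend = 0xf0000;      /* end of guest low RAM, in pages */
    bool allow_memory_relocate = true;     /* xenstore-controlled knob */
    bool bar64_relocate = false;

    /*
     * Grow the hole while it is too small: the 32-bit left shift moves
     * the hole start down (0xf0000000 -> 0xe0000000 -> ...), stopping
     * at a 2GB hole, or - when memory relocation is not allowed - as
     * soon as growing further would overlap guest RAM.
     */
    while ( mmio_total > (uint64_t)(PCI_MEM_END - pci_mem_start)
            && (uint32_t)(pci_mem_start << 1) != 0
            && (allow_memory_relocate
                || ((uint32_t)(pci_mem_start << 1) >> PAGE_SHIFT)
                   >= low_mem_pgend) )
        pci_mem_start <<= 1;

    /* Only if the hole still cannot fit everything do BARs go above 4GB. */
    if ( mmio_total > (uint64_t)(PCI_MEM_END - pci_mem_start) )
        bar64_relocate = true;

    /*
     * Any low RAM now sitting under the enlarged hole is what the
     * "Relocate RAM that overlaps PCI space" step moves above 4GB
     * (via XENMEM_add_to_physmap in the real code).
     */
    if ( (pci_mem_start >> PAGE_SHIFT) < low_mem_pgend )
        printf("would relocate 0x%" PRIx64 " pages of low RAM above 4GB\n",
               low_mem_pgend - (pci_mem_start >> PAGE_SHIFT));

    printf("bar64_relocate = %d, hole = [%#" PRIx32 ", %#x)\n",
           bar64_relocate, pci_mem_start, PCI_MEM_END);
    return 0;
}

With numbers like the above the hole grows enough on its own, so
bar64_relocate never gets set, yet 256MB of RAM is still pushed above
4GB - which matches what I'm seeing.
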
> So BAR relocation to 64-bit is not happening, but RAM relocation as
> per the code tagged as /* Relocate RAM that overlaps PCI space (in
> 64k-page chunks). */ is happening. This maybe is correct (?), although
> I think the fact that RAM is relocated but the BARs are not causes the
> tools (i.e. qemu) to lose sight of what memory is used for mmio and,
> as you mentioned in one of the previous posts, the calls which would
> set it to mmio_direct in the p2m table are not happening. Our qemu is
> pretty ancient and doesn't support 64-bit BARs, so it's not super
> trivial to verify whether relocating the BARs to 64-bit would help.
> Trying to make sense out of this..

Actually, the plausible explanation for the performance issue we see
seems to be this:

- some region of guest memory has been relocated by hvmloader up to
  64-bit space to enlarge the PCI MMIO hole (which stays in 32-bit
  space)
- BUT the caching on that relocated region is UC, since at the time of
  the relocation the MTRRs were disabled, and that caused the EPT
  entries to get the UC type
- however, since this is just a region of guest memory that is not
  actually used for MMIO, merely relocated out of the MMIO hole, the
  caching should be WB; the guest doesn't use that region for MMIO but
  for other tasks, so access to it is slow and slows the guest down
- as you mentioned, this might already be fixed on unstable, since the
  EPT entries are updated there when the MTRRs get enabled

That would explain why retaining the old loop removed by the XSA-60 fix
cures the perf issue: since it runs at the time the MTRRs are enabled,
it reverts the relocated region to WB (which I guess is correct for the
non-MMIO regions).
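
To make that concrete, here is a tiny standalone model of how I
understand the effective EPT memory type gets picked. This is not the
actual Xen code (the real decision is made in epte_get_entry_emt() and
the MTRR code, if I read it right); the names and constants below are
made up for illustration only:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MEMTYPE_UC 0   /* uncacheable */
#define MEMTYPE_WB 6   /* write-back  */

/*
 * is_mmio:       page is direct-mapped device MMIO (passthrough BAR)
 * mtrr_enabled:  guest MTRRdefType.E at the time the EPT entry is built
 * range_type:    type the guest MTRRs give this range once enabled
 */
static uint8_t effective_ept_type(bool is_mmio, bool mtrr_enabled,
                                  uint8_t range_type)
{
    if ( is_mmio )
        return MEMTYPE_UC;   /* passthrough MMIO must stay UC */
    if ( !mtrr_enabled )
        return MEMTYPE_UC;   /* MTRRs still disabled => UC; this is what
                              * the RAM relocated by hvmloader is stuck
                              * with, as relocation happens before the
                              * guest enables its MTRRs */
    return range_type;       /* ordinarily WB for plain guest RAM */
}

int main(void)
{
    /* RAM relocated while hvmloader runs (MTRRs not yet enabled): UC. */
    printf("relocated RAM, MTRRs off: type %u\n",
           effective_ept_type(false, false, MEMTYPE_WB));

    /* The same RAM once the guest has enabled MTRRs with a WB default:
     * WB - which is what the pre-XSA-60 resync loop used to restore. */
    printf("same RAM, MTRRs on:       type %u\n",
           effective_ept_type(false, true, MEMTYPE_WB));
    return 0;
}
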