xen-devel.lists.xenproject.org archive mirror
* GPU passthrough performance regression in >4GB vms due to XSA-60 changes
@ 2014-05-15  9:11 Tomasz Wroblewski
From: Tomasz Wroblewski @ 2014-05-15  9:11 UTC (permalink / raw)
  To: xen-devel; +Cc: jinsong.liu, JBeulich

Hello,

We've recently updated from Xen 4.3.1 to 4.3.2 and found a major
regression in GPU passthrough performance in VMs using >4GB of memory.
When using GPU passthrough (some Radeon cards, also integrated Intel
GPUs), CPU load is constantly near maximum and the screen is slow to
update. The machines are Intel Haswell/Ivy Bridge laptops/desktops; the
guests are Windows 7 64-bit HVMs.

I've bisected the regression to the XSA-60 changes, specifically:

commit e81d0ac25464825b3828cff5dc9e8285612992c4
Author: Liu Jinsong <jinsong.liu@intel.com>
Date:   Mon Dec 9 14:26:03 2013 +0100

     VMX: remove the problematic set_uc_mode logic

This commit seems to have removed a bit of logic which, when the guest
briefly set the cache disable (CD) bit in CR0, iterated over all mapped
pfns and reset the memory type in the EPT entries to be consistent with
the result of the mtrr.c:epte_get_entry_emt() call. I believe my tracing
indicates this used to return the WRITEBACK caching strategy for the
64-bit memory areas where the BARs of the GPU seem to be located.

This code no longer runs, so my guess is that the PCI BAR area stays
uncached, which causes the general slowness. Note that I'm not talking
about slow performance during the window in which CR0 has caching
disabled; it stays slow even after the guest re-enables caching shortly
afterwards. The problem seems to be a side effect of removing the loop,
which was setting some default EPT policies on all pfns. Reintroducing
the removed loop fixes the problem.
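
As a rough illustration of the behaviour described above, here is a toy
model in C. It is not the Xen code removed by the commit: the names
toy_epte, toy_get_entry_emt() and toy_resync_emt() are hypothetical, and
the policy function simply returns write-back for every pfn, which is an
assumption based on what the tracing seemed to show for the 64-bit BAR
region.

```c
#include <stddef.h>
#include <stdint.h>

/* Memory types, using the x86 encodings for UC (0) and WB (6). */
enum memtype { MT_UC = 0, MT_WB = 6 };

/* Toy stand-in for an EPT leaf entry: a guest pfn and its memory type. */
struct toy_epte {
    uint64_t pfn;
    enum memtype emt;
};

/*
 * HYPOTHETICAL policy function standing in for
 * mtrr.c:epte_get_entry_emt(). The real function takes more inputs;
 * this sketch assumes the answer is always write-back.
 */
enum memtype toy_get_entry_emt(uint64_t pfn)
{
    (void)pfn;
    return MT_WB;
}

/*
 * Model of the removed loop: walk every mapped pfn and rewrite its
 * memory type to match the policy function. Returns the number of
 * entries whose type changed.
 */
size_t toy_resync_emt(struct toy_epte *tbl, size_t n)
{
    size_t changed = 0;

    for (size_t i = 0; i < n; i++) {
        enum memtype want = toy_get_entry_emt(tbl[i].pfn);

        if (tbl[i].emt != want) {
            tbl[i].emt = want;
            changed++;
        }
    }
    return changed;
}
```

In this model, an entry above the 4GB boundary that starts out as MT_UC
is switched to MT_WB the first time toy_resync_emt() runs, and a second
run changes nothing. Without the resync pass the entry simply keeps
whatever type it was created with, which matches the symptom described.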

I would welcome comments or ideas on how to debug this further, or maybe
there's an obvious fix.


Thread overview: 31+ messages
2014-05-15  9:11 GPU passthrough performance regression in >4GB vms due to XSA-60 changes Tomasz Wroblewski
2014-05-15 12:32 ` Jan Beulich
2014-05-15 12:10   ` Tomasz Wroblewski
2014-05-15 13:23     ` Jan Beulich
2014-05-15 13:39       ` Tomasz Wroblewski
2014-05-15 14:34         ` Tomasz Wroblewski
2014-05-15 14:56           ` Tomasz Wroblewski
2014-05-15 16:07             ` Jan Beulich
2014-05-15 15:39               ` Tomasz Wroblewski
2014-05-16  6:33                 ` Jan Beulich
2014-05-16 11:18                   ` Tomasz Wroblewski
2014-05-16 11:38                     ` Jan Beulich
2014-05-16 14:36                       ` Jan Beulich
2014-05-19 10:29                         ` Tomasz Wroblewski
2014-05-19 10:38                           ` Jan Beulich
2014-05-19 10:47                             ` Tomasz Wroblewski
2014-05-19 11:07                               ` Jan Beulich
2014-05-19 11:32                                 ` Tomasz Wroblewski
2014-05-19 12:06                                   ` Jan Beulich
2014-05-19 12:17                                     ` Tomasz Wroblewski
2014-05-19 12:44                                       ` Jan Beulich
2014-05-19 14:20                                         ` Tomasz Wroblewski
2014-05-19 15:24                                           ` Jan Beulich
2014-05-19 15:48                                             ` Tomasz Wroblewski
2014-05-19 17:36                                             ` Tim Deegan
2014-05-20  6:31                                               ` Jan Beulich
2014-05-19 10:42                           ` Tomasz Wroblewski
2014-05-19 11:01                             ` Jan Beulich
2014-05-19 11:09                               ` Tomasz Wroblewski
2014-05-19 11:19                                 ` Jan Beulich
2014-05-15 16:01         ` Jan Beulich
