From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH RFC v2 1/4] x86/mm: Shadow and p2m changes for PV mem_access Date: Fri, 22 Aug 2014 10:34:17 +0100 Message-ID: <53F70E99.7030501@citrix.com> References: <1404787805-4540-1-git-send-email-aravindp@cisco.com> <1404787805-4540-2-git-send-email-aravindp@cisco.com> <53D13457020000780002599B@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188B795@xmb-aln-x02.cisco.com> <53D221270200007800025CEA@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188BE3A@xmb-aln-x02.cisco.com> <53D60EA002000078000265FC@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188EC46@xmb-aln-x02.cisco.com> <53D8B6AC0200007800027844@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188EF44@xmb-aln-x02.cisco.com> <53DB52380200007800028448@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188FBD7@xmb-aln-x02.cisco.com> <53DF4C6B0200007800028CA5@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC63318A2DBD@xmb-rcd-x02.cisco.com> <5 3E096C902000078000294A2@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC63318E00F6@xmb-aln-x02.cisco.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1XKlEl-0006OC-Jx for xen-devel@lists.xenproject.org; Fri, 22 Aug 2014 09:34:23 +0000 In-Reply-To: <97A500D504438F4ABC02EBA81613CC63318E00F6@xmb-aln-x02.cisco.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Aravindh Puthiyaparambil (aravindp)" , Jan Beulich Cc: "xen-devel@lists.xenproject.org" , Tim Deegan , Keir Fraser , Ian Jackson , Ian Campbell List-Id: xen-devel@lists.xenproject.org On 22/08/14 03:29, Aravindh Puthiyaparambil (aravindp) wrote: >>> No, at least not about unspecified hypothetical ones. But again - a >>> vague statement like you gave, without any kind of quantification of >>> the imposed overhead, isn't going to be good enough a judgment. >>> After all pausing a domain can be quite problematic for its performance >>> if that happens reasonably frequently. Otoh I admit that the user of >>> your new mechanism has a certain level of control over the impact via >>> the number of pages (s)he wants to write-protect. So yes, perhaps it >>> isn't going to be too bad as long as the hackery you need to do isn't. >> I just wanted to give an update as to where I stand so as to not leave this >> thread hanging. As I was working through pausing and unpausing the domain >> during Xen writes to guest memory, I found a spot where the write happens >> outside of __copy_to_user_ll() and __put_user_size(). It occurs in >> create_bounce_frame() where Xen writes to the guest stack to setup an >> exception frame. At that moment, if the guest stack is marked read-only, we >> end up in the same situation as with the copy to guest functions but with no >> obvious place to revert the page table entry back to read-only. I am at the >> moment looking for a spot where I can do this. > I have a solution for the create_bounc_frame() issue I described above. Please find below a POC patch that includes pausing and unpausing the domain during the Xen writes to guest memory. I have it on top of the patch that was using CR0.WP to highlight the difference. Please take a look and let me know if this solution is acceptable. > > PS: I do realize whatever I do to create_bounce_frame() will have to be reflected in the compat version. If this is correct approach I will do the same there too. > > Thanks, > Aravindh What is wrong with just making use of CR0.WP to solve this issue? Alternatively, locate the page in question and use map_domain_page() to get a supervisor rw mapping. I am concerned with the addition of a the vcpu specifics to shadow_write_entries(). Most of the shadow code is already vcpu centric where it should be domain centric, and steps are being made to alleviate these problems. Any access in from a toolstack/device model hypercall will probably be using vcpu[0], which will cause this logic to be applied in an erroneous context. ~Andrew