From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH RFC v2 1/4] x86/mm: Shadow and p2m changes for PV mem_access Date: Fri, 22 Aug 2014 21:02:11 +0100 Message-ID: <53F7A1C3.3060007@citrix.com> References: <1404787805-4540-1-git-send-email-aravindp@cisco.com> <53D221270200007800025CEA@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188BE3A@xmb-aln-x02.cisco.com> <53D60EA002000078000265FC@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188EC46@xmb-aln-x02.cisco.com> <53D8B6AC0200007800027844@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188EF44@xmb-aln-x02.cisco.com> <53DB52380200007800028448@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC633188FBD7@xmb-aln-x02.cisco.com> <53DF4C6B0200007800028CA5@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC63318A2DBD@xmb-rcd-x02.cisco.com> <53E096C902000078000294A2@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC63318E00F6@xmb-aln-x02.cisco.com> <53F77EEC020000780002CC3E@mail.emea.novell.com> <97A500D504438F4ABC02EBA81613CC63318E04E7@xmb-aln-x02.cisco.com> <53F798D9.9090800@citrix.com> <97A500D504438F4ABC02EBA81613CC63318E057E@xmb-aln-x02.cisco.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta4.messagelabs.com ([85.158.143.247]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1XKv2V-0000mp-Ki for xen-devel@lists.xenproject.org; Fri, 22 Aug 2014 20:02:23 +0000 In-Reply-To: <97A500D504438F4ABC02EBA81613CC63318E057E@xmb-aln-x02.cisco.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Aravindh Puthiyaparambil (aravindp)" , Jan Beulich Cc: "xen-devel@lists.xenproject.org" , Keir Fraser , Ian Jackson , Ian Campbell , Tim Deegan List-Id: xen-devel@lists.xenproject.org On 22/08/14 20:48, Aravindh Puthiyaparambil (aravindp) wrote: >>>> As Andrew already pointed out, you absolutely need to deal with page >>>> crossing accesses, >>> Is this for say an unsigned long that lives across two pages? Off the top of >> my head, I think always allowing writes to the page in question and the next >> followed by reverting to default for both pages at the end of the write should >> take care of this. I would have to walk the page tables to figure out the next >> mfn. Or am I on the wrong track here? >> >> create_bounce_frame puts several adjacent words on a guest stack, and this >> is very capable of crossing a page boundary. >> >> Even an unaligned uint16_t can cross a page boundary. > OK, so marking two adjacent pages as writable and reverting after the write went through should solve this problem. > >>>> and I think you also need to deal with hypervisor accesses extending >>>> beyond a page worth of memory (I'm not sure we have a firmly >>>> determined upper bound of how much memory we may copy in one go). >>> Let me try to understand what happens in the non-mem_access case. Say >> the hypervisor is writing to three pages and all of them are not accessible in >> the guest. Which one of the following is true? >>> 1. There is a pagefault for the first page which is resolved. The write is then >> retried which causes a fault for the second page which is resolved. Then the >> write is retried starting from the second page and so on for the third page too. >>> 2. Or does the write get retried starting from the first page each time the >> page fault is resolved? >> >> For the non-mem_access case, all faults cause failures. >> >> copy_to/from_user() will typically result in an -EFAULT being handed back to >> the hypercaller. For create_bounce_frame, the results are more severe and >> might result in a domain crash or an injection of a failsafe callback. >> >> No attempt is made to play with the page permissions, as it is the guests fault >> that the pages have the wrong permissions. >> >> What mem_access introduces is a case where it is Xen's fault that a write fault >> occured, and the fault should be worked around as the guest is unaware that >> its pages are actually read-only. > Ouch, this does make things complicated. The only thing I can think of trying is your suggestion "Alternatively, locate the page in question and use map_domain_page() to get a supervisor rw mapping.". Do this only in __copy_to_user_ll() for copies that span multiple pages in the cases where a mem_access listener is present and listening for write violations. > > Sigh, if only I could bound the CR0.WP solution :-( I wonder whether, in the case that mem_access is enabled, it would be reasonable to perform the CR0.WP sections with interrupts disabled? The user/admin is already taking a performance hit from mem_access in the first place, and delaying hardware interrupts a little almost certainly a lesser evil than whatever mad scheme we devise to fix these issues. With interrupts disabled, the CR0.WP problem is very well bounded. The only faults which will occur will be as a direct result of the actions performed, where the fault handlers will follow the extable redirection and return quickly. ~Andrew