From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH RFC v2 1/4] x86/mm: Shadow and p2m changes
	for PV mem_access
Date: Fri, 22 Aug 2014 21:02:11 +0100
Message-ID: <53F7A1C3.3060007@citrix.com>
References: <1404787805-4540-1-git-send-email-aravindp@cisco.com>
	<53D221270200007800025CEA@mail.emea.novell.com>
	<97A500D504438F4ABC02EBA81613CC633188BE3A@xmb-aln-x02.cisco.com>
	<53D60EA002000078000265FC@mail.emea.novell.com>
	<97A500D504438F4ABC02EBA81613CC633188EC46@xmb-aln-x02.cisco.com>
	<53D8B6AC0200007800027844@mail.emea.novell.com>
	<97A500D504438F4ABC02EBA81613CC633188EF44@xmb-aln-x02.cisco.com>
	<53DB52380200007800028448@mail.emea.novell.com>
	<97A500D504438F4ABC02EBA81613CC633188FBD7@xmb-aln-x02.cisco.com>
	<53DF4C6B0200007800028CA5@mail.emea.novell.com>
	<97A500D504438F4ABC02EBA81613CC63318A2DBD@xmb-rcd-x02.cisco.com>
	<53E096C902000078000294A2@mail.emea.novell.com>
	<97A500D504438F4ABC02EBA81613CC63318E00F6@xmb-aln-x02.cisco.com>
	<53F77EEC020000780002CC3E@mail.emea.novell.com>
	<97A500D504438F4ABC02EBA81613CC63318E04E7@xmb-aln-x02.cisco.com>
	<53F798D9.9090800@citrix.com>
	<97A500D504438F4ABC02EBA81613CC63318E057E@xmb-aln-x02.cisco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta4.messagelabs.com ([85.158.143.247])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <Andrew.Cooper3@citrix.com>) id 1XKv2V-0000mp-Ki
	for xen-devel@lists.xenproject.org; Fri, 22 Aug 2014 20:02:23 +0000
In-Reply-To: <97A500D504438F4ABC02EBA81613CC63318E057E@xmb-aln-x02.cisco.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Aravindh Puthiyaparambil (aravindp)" <aravindp@cisco.com>, Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>, Keir Fraser <keir@xen.org>, Ian Jackson <ian.jackson@eu.citrix.com>, Ian Campbell <ian.campbell@citrix.com>, Tim Deegan <tim@xen.org>
List-Id: xen-devel@lists.xenproject.org

On 22/08/14 20:48, Aravindh Puthiyaparambil (aravindp) wrote:
>>>> As Andrew already pointed out, you absolutely need to deal with page
>>>> crossing accesses,
>>> Is this for say an unsigned long that lives across two pages? Off the top of
>> my head, I think always allowing writes to the page in question and the next
>> followed by reverting to default for both pages at the end of the write should
>> take care of this. I would have to walk the page tables to figure out the next
>> mfn. Or am I on the wrong track here?
>>
>> create_bounce_frame puts several adjacent words on a guest stack, and this
>> is very capable of crossing a page boundary.
>>
>> Even an unaligned uint16_t can cross a page boundary.
> OK, so marking two adjacent pages as writable and reverting after the write went through should solve this problem.
>
>>>> and I think you also need to deal with hypervisor accesses extending
>>>> beyond a page worth of memory (I'm not sure we have a firmly
>>>> determined upper bound of how much memory we may copy in one go).
>>> Let me try to understand what happens in the non-mem_access case. Say
>> the hypervisor is writing to three pages and all of them are not accessible in
>> the guest. Which one of the following is true?
>>> 1. There is a pagefault for the first page which is resolved. The write is then
>> retried which causes a fault for the second page which is resolved. Then the
>> write is retried starting from the second page and so on for the third page too.
>>> 2. Or does the write get retried starting from the first page each time the
>> page fault is resolved?
>>
>> For the non-mem_access case, all faults cause failures.
>>
>> copy_to/from_user() will typically result in an -EFAULT being handed back to
>> the hypercaller.  For create_bounce_frame, the results are more severe and
>> might result in a domain crash or an injection of a failsafe callback.
>>
>> No attempt is made to play with the page permissions, as it is the guests fault
>> that the pages have the wrong permissions.
>>
>> What mem_access introduces is a case where it is Xen's fault that a write fault
>> occured, and the fault should be worked around as the guest is unaware that
>> its pages are actually read-only.
> Ouch, this does make things complicated. The only thing I can think of trying is your suggestion "Alternatively, locate the page in question and use map_domain_page() to get a supervisor rw mapping.". Do this only in __copy_to_user_ll() for copies that span multiple pages in the cases where a mem_access listener is present and listening for write violations. 
>
> Sigh, if only I could bound the CR0.WP solution :-(

I wonder whether, in the case that mem_access is enabled, it would be
reasonable to perform the CR0.WP sections with interrupts disabled?

The user/admin is already taking a performance hit from mem_access in
the first place, and delaying hardware interrupts a little almost
certainly a lesser evil than whatever mad scheme we devise to fix these
issues.

With interrupts disabled, the CR0.WP problem is very well bounded.  The
only faults which will occur will be as a direct result of the actions
performed, where the fault handlers will follow the extable redirection
and return quickly.

~Andrew