From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [PATCH v5] x86/p2m: use large pages for MMIO mappings
Date: Wed, 27 Jan 2016 14:28:17 +0000
Message-ID: <56A8D401.6080100@citrix.com>
References: <56A25C0602000078000CA367@prv-mh.provo.novell.com>
 <1453724207.4320.137.camel@citrix.com>
 <56A6371802000078000CAA6B@prv-mh.provo.novell.com>
 <1453730752.4320.164.camel@citrix.com>
 <56A63C4002000078000CAAA7@prv-mh.provo.novell.com>
 <1453731704.4320.173.camel@citrix.com>
 <56A658FE02000078000CAC3D@prv-mh.provo.novell.com>
 <56A8B8C2.5010905@citrix.com>
 <56A8D61202000078000CB8EF@prv-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1aOR57-0007U8-4e for xen-devel@lists.xenproject.org;
 Wed, 27 Jan 2016 14:28:25 +0000
In-Reply-To: <56A8D61202000078000CB8EF@prv-mh.provo.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Kevin Tian, Wei Liu, Ian Campbell, Stefano Stabellini, George Dunlap,
 Tim Deegan, Ian Jackson, Jun Nakajima, xen-devel, Keir Fraser
List-Id: xen-devel@lists.xenproject.org

On 27/01/16 13:37, Jan Beulich wrote:
>>>> On 27.01.16 at 13:32, wrote:
>> On 25/01/16 16:18, Jan Beulich wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -2491,7 +2491,7 @@ static int vmx_alloc_vlapic_mapping(stru
>>>      share_xen_page_with_guest(pg, d, XENSHARE_writable);
>>>      d->arch.hvm_domain.vmx.apic_access_mfn = mfn;
>>>      set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE), _mfn(mfn),
>>> -                       p2m_get_hostp2m(d)->default_access);
>>> +                       PAGE_ORDER_4K, p2m_get_hostp2m(d)->default_access);
>>>
>> This should ASSERT() success, in case we make further changes to the
>> error handling.
> Maybe, but since it didn't before I don't see why this couldn't /
> shouldn't be an independent future patch.

Can be.  IMO it is a bug that it isn't already checked.  (-ENOMEM when
allocating p2m leaves perhaps?)

>
>>> --- a/xen/arch/x86/mm/p2m.c
>>> +++ b/xen/arch/x86/mm/p2m.c
>>> @@ -899,48 +899,62 @@ void p2m_change_type_range(struct domain
>>>      p2m_unlock(p2m);
>>>  }
>>>
>>> -/* Returns: 0 for success, -errno for failure */
>>> +/*
>>> + * Returns:
>>> + *    0        for success
>>> + *    -errno   for failure
>>> + *    order+1  for caller to retry with order (guaranteed smaller than
>>> + *             the order value passed in)
>>> + */
>>>  static int set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
>>> -                               p2m_type_t gfn_p2mt, p2m_access_t access)
>>> +                               unsigned int order, p2m_type_t gfn_p2mt,
>>> +                               p2m_access_t access)
>>>  {
>>>      int rc = 0;
>>>      p2m_access_t a;
>>>      p2m_type_t ot;
>>>      mfn_t omfn;
>>> +    unsigned int cur_order = 0;
>>>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>
>>>      if ( !paging_mode_translate(d) )
>>>          return -EIO;
>>>
>>> -    gfn_lock(p2m, gfn, 0);
>>> -    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
>>> +    gfn_lock(p2m, gfn, order);
>>> +    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, &cur_order, NULL);
>>> +    if ( cur_order < order )
>>> +    {
>>> +        gfn_unlock(p2m, gfn, order);
>>> +        return cur_order + 1;
>> Your comment states that the return value is guaranteed to be less than
>> the passed-in order, but this is not the case here.  cur_order could, in
>> principle, be only 1 less than order, at which point your documentation
>> is incorrect.
>>
>> Does this rely on the x86 architectural orders to function as documented?
> No. Maybe the comment text is ambiguous, but I don't see how to
> improve it without making it too lengthy: The return value is
> +1, telling the caller to retry with , which is
> guaranteed to be less than the order that got passed in. I.e. taking
> the variable naming above, the caller would have to retry with
> cur_order, which - due to the if() - is smaller than order.

Ah - I see.  The text is indeed confusing.  How about:

"1 + new order: for caller to retry with smaller order (guaranteed to be
smaller than order passed in)"

>
>>> +    }
>>>      if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
>>>      {
>>> -        gfn_unlock(p2m, gfn, 0);
>>> +        gfn_unlock(p2m, gfn, order);
>>>          domain_crash(d);
>>>          return -ENOENT;
>>>      }
>>>      else if ( p2m_is_ram(ot) )
>>>      {
>>> +        unsigned long i;
>>> +
>>>          ASSERT(mfn_valid(omfn));
>> Shouldn't this check be extended to the top of the order?
> Well, yes, perhaps better to move it into ...
>
>>> -        set_gpfn_from_mfn(mfn_x(omfn), INVALID_M2P_ENTRY);
>>> +        for ( i = 0; i < (1UL << order); ++i )
>>> +            set_gpfn_from_mfn(mfn_x(omfn) + i, INVALID_M2P_ENTRY);
> ... the body of the for(). But I'll wait with v6 until we settled on
> the other aspects you raise.
>
>>>  int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
>>> -                       p2m_access_t access)
>>> +                       unsigned int order, p2m_access_t access)
>>>  {
>>> -    return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct, access);
>>> +    if ( order &&
>>> +         rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
>>> +                                 mfn_x(mfn) + (1UL << order) - 1) &&
>>> +         !rangeset_contains_range(mmio_ro_ranges, mfn_x(mfn),
>>> +                                  mfn_x(mfn) + (1UL << order) - 1) )
>>> +        return order;
>> Should this not be a hard error?  Even retrying with a lower order is
>> going to fail.
> Why? The latest when order == 0, rangeset_overlaps_range()
> will return the same as rangeset_contains_range(), and hence
> the condition above will always be false (one of the two reasons
> for checking order first here).

It isn't the order check which is an issue.  One way or another, if the
original (mfn/order) fails the rangeset checks, the overall call is
going to fail, but it will be re-executed repeatedly with an order
decreasing to 0.
Wouldn't it be better just to short-circuit this back and forth?

Relatedly, is there actually anything wrong with making a superpage
read-only mapping over some scattered read-only 4K pages?

~Andrew
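
For illustration, the "1 + new order" return convention discussed above
can be modelled outside Xen.  The following is only a toy sketch under
stated assumptions, not the patch's code: toy_set_entry(), toy_map() and
the existing_order parameter are hypothetical names standing in for
set_typed_p2m_entry(), its caller, and the order reported by
p2m->get_entry().  It shows that the value returned may equal the order
passed in, while the order to retry with is always strictly smaller.

```c
#include <assert.h>
#include <errno.h>

/*
 * Toy model of the return convention (NOT Xen code):
 *   0        success
 *   -errno   failure
 *   n + 1    caller should retry with order n, where n < order passed in
 */
static int toy_set_entry(unsigned long gfn, unsigned int order,
                         unsigned int existing_order)
{
    /* Assumption of this sketch: a misaligned request is a hard error. */
    if ( gfn & ((1UL << order) - 1) )
        return -EINVAL;

    /* Existing mapping is smaller than requested: ask for a retry. */
    if ( existing_order < order )
        return existing_order + 1;

    return 0;
}

/*
 * Hypothetical caller: retry with the suggested order until the callee
 * reports success or a real error.  *tries counts the calls made.
 */
static int toy_map(unsigned long gfn, unsigned int order,
                   unsigned int existing_order, unsigned int *tries)
{
    int rc;

    *tries = 0;
    for ( ; ; )
    {
        ++*tries;
        rc = toy_set_entry(gfn, order, existing_order);
        if ( rc <= 0 )
            return rc;                       /* 0 or -errno: done */

        /* The retry order (rc - 1) must be strictly smaller. */
        assert((unsigned int)rc - 1 < order);
        order = rc - 1;
    }
}
```

With order 9 requested over an existing order-8 mapping, toy_set_entry()
returns 9, which is not smaller than the order passed in, but the order
to retry with (8) is.  The loop terminates because the order strictly
decreases on every retry.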