From mboxrd@z Thu Jan 1 00:00:00 1970 From: George Dunlap Subject: Re: [PATCH v7] x86/p2m: use large pages for MMIO mappings Date: Tue, 9 Feb 2016 12:17:08 +0000 Message-ID: <56B9D8C4.2070706@citrix.com> References: <56B0D61302000078000CD962@prv-mh.provo.novell.com> <56B8D8C7.3050307@citrix.com> <56B9B47302000078000CFE4F@prv-mh.provo.novell.com> <56B9C5ED.4010403@citrix.com> <56B9E01502000078000CFFDE@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aT7EL-0003I4-PD for xen-devel@lists.xenproject.org; Tue, 09 Feb 2016 12:17:17 +0000 In-Reply-To: <56B9E01502000078000CFFDE@prv-mh.provo.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: Kevin Tian , Wei Liu , Ian Campbell , Stefano Stabellini , George Dunlap , Andrew Cooper , Ian Jackson , TimDeegan , Jun Nakajima , xen-devel , Keir Fraser List-Id: xen-devel@lists.xenproject.org On 09/02/16 11:48, Jan Beulich wrote: >>>> On 09.02.16 at 11:56, wrote: >> On 09/02/16 08:42, Jan Beulich wrote: >>>>>> On 08.02.16 at 19:04, wrote: >>>> First -- and this isn't a blocker, but just a question (and sorry if >>>> it's been answered before) -- why have the "0 means I did it all, >>> means I did it partially"? Why not just return the number of entries >>>> actually handled? >>> >>> Because I view zero as the conventional "success" indicator. No >>> other deeper reason. >> >> For certain classes of functionality, yes. But for other classes -- >> where the caller will request N but the OS / hypervisor / whatever may >> do some other number fewer than N, then the convention is to always pass >> the number of items completed. read() and write() system calls are like >> this, as are most of the XENMEM operations (e.g., >> XENMEM_increase_resevration). >> >> Since this domctl looks a lot like some of those XENMEM operations, >> wouldn't it make more sense to follow that convention, and return the >> number of extents actually allocated? > > Well, if comparing with the XENMEM ones, perhaps. But if comparing > with other domctls, perhaps rather not? I've really been undecided > here from the very beginning, since I had considered that alternative > right away. Well from a brief survey of things that can partially succeed (getmemlist, gethvmcontext, {get,set}_vcpu_msrs), it seems that most of them: 1. Use the return value *only* for success or failure; they either return 0 or -errno, even if they only partially succeed 2. Return the actual number of things accomplished in the struct itself. You seem to be introducing a new model, where you use the return value sort of for all three. (Are there any other hypercalls that behave this way?) I don't think sometimes returning the number of things you did and sometimes returning zero makes any sense. My suggestion would be either make "nr_mfns" bidirectional (as similar fields are in the other domctls) and return 0 on either full or partial success, or just return the number of mfns actually mapped either on full or partial success. >> Under those conditions, it looks to me like the following will happen: >> 1. map_mmio_regions() will call set_mmio_p2m_entry() with order 9 >> 2. set_mmio_p2m_entry() will return 9 (requesting using order 8) >> 3. map_mmio_regions() will call set_mmio_p2m_entry() with order 8 >> 4. set_mmio_p2m_entry() will call set_typed_p2m_entry() with order 8 >> 5. set_typed_p2m_entry() will read the current p2m entry of >= 9 and >> continue (since 8 <= 9) >> 6. set_typed_p2m_entry() will call p2m_set_entry() with an order of 8 >> 7. ept_set_entry() will return -EINVAL, since order % EPT_TABLE_ORDER != 0. >> >> Am I missing something? > > Between step 6 and step 7 there is actually p2m_set_entry() > breaking up the request into chunks or orders the hardware > supports. Oh, right -- sorry, I assumed that p2m_set_entry() was just an alias for the underlying *set_entry() function. I see now there's a loop handling arbitrary orders. Thanks. :-) -George