From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mats Petersson Subject: Re: [PATCH V5] xen/privcmd.c: improve performance of MMAPBATCH_V2. Date: Wed, 2 Jan 2013 16:45:46 +0000 Message-ID: <50E4643A.2090003@citrix.com> References: <1355943187-4167-1-git-send-email-mats.petersson@citrix.com> <1356000020.26722.39.camel@zakaz.uk.xensource.com> <50D2EFCE.8070706@citrix.com> <1356003167.26722.55.camel@zakaz.uk.xensource.com> <716BE5EF-DAAC-47DE-A4FF-96D67219E4B3@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <716BE5EF-DAAC-47DE-A4FF-96D67219E4B3@gmail.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Andres Lagar-Cavilla Cc: "xen-devel@lists.xensource.com" , Ian Campbell , "JBeulich@suse.com" , David Vrabel , "mats@planetcatfish.com" , "konrad@darnok.org" , Andres Lagar-Cavilla List-Id: xen-devel@lists.xenproject.org On 02/01/13 15:47, Andres Lagar-Cavilla wrote: > Hi, > On Dec 20, 2012, at 6:32 AM, Ian Campbell wrote: > >> On Thu, 2012-12-20 at 11:00 +0000, Mats Petersson wrote: >> >>>>> - err = 0; >>>>> + err = last_err; >>>> This means that if you have 100 failures followed by one success you >>>> return success overall. Is that intentional? That doesn't seem right. >>> As far as I see, it doesn't mean that. last_err is only set at the >>> beginning of the call (to zero) and if there is an error. >> Ah, yes I missed that (but it still isn't right, see below). >> >>>>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c >>>>> index 0bbbccb..8f86a44 100644 >>>>> --- a/drivers/xen/privcmd.c >>>>> +++ b/drivers/xen/privcmd.c >>>> [...] >>>>> @@ -283,12 +317,10 @@ static int mmap_batch_fn(void *data, void *state) >>>>> if (xen_feature(XENFEAT_auto_translated_physmap)) >>>>> cur_page = pages[st->index++]; >>>>> >>>>> - ret = xen_remap_domain_mfn_range(st->vma, st->va & PAGE_MASK, *mfnp, 1, >>>>> - st->vma->vm_page_prot, st->domain, >>>>> - &cur_page); >>>>> - >>>>> - /* Store error code for second pass. */ >>>>> - *(st->err++) = ret; >>>>> + BUG_ON(nr < 0); >>>>> + ret = xen_remap_domain_mfn_array(st->vma, st->va & PAGE_MASK, mfnp, nr, >>>>> + st->err, st->vma->vm_page_prot, >>>>> + st->domain, &cur_page); >>>>> >>>>> /* And see if it affects the global_error. */ >>>> The code which follows needs adjustment to cope with the fact that we >>>> now batch. I think it needs to walk st->err and set global_error as >>>> appropriate. This is related to my comment about the return value of >>>> xen_remap_domain_mfn_range too. >>> The return value, as commented above, is "either 0 or the last error we >>> saw". There should be no need to walk the st->err, as we know if there >>> was some error or not. >> The code which follows tries to latch having seen ENOENT in preference >> to other errors, so you don't need 0 or the last error, you need either >> 0, ENOENT if one occurred somewhere in the batch or an error!=ENOENT. >> (or maybe its vice versa, either way the last error isn't necessarily >> what you need). > Yeah the error reporting interface sucks. All I can say in my defense is that it was that way when I got here. > > FWIW EFAULT *takes* precedence over ENOENT. > /* If we have not had any EFAULT-like global errors then set the global > * error to -ENOENT if necessary. */ > if ((ret == 0) && (state.global_error == -ENOENT)) > ret = -ENOENT; > > Otherwise, any individual mapping that fails differently from ENOENT does not affect the global error. That is, in absence of global errors like EFAULT, and in absence of per-mapping ENOENT return codes, per mapping errors like EINVAL do not affect the ioctl return code which remains zero. > > IIUC Ian has been pointing that your code breaks this last statement. Sorry, are you saying the correct behaviour is to return 0 if we have EINVAL or some other error [aside from EFAULT and ENOENT]? Or are you saying that my code is broken such that it does this? I did some fixes, that I believe "does the right thing", before Christmas, but since I knew most people wouldn't be available, I didn't post it. I will try to get round to it before the end of the week. -- Mats > > Thanks > Andres >>>> I think rather than trying to fabricate some sort of meaningful error >>>> code for an entire batch xen_remap_domain_mfn_range should just return >>>> an indication about whether there were any errors or not and leave it to >>>> the caller to figure out the overall result by looking at the err array. >>>> >>>> Perhaps return either the number of errors or the number of successes >>>> (which turns the following if into either (ret) or (ret < nr) >>>> respectively). >>> I'm trying to not change how the code above expects things to work. >>> Whilst it would be lovely to rewrite the entire code dealing with >>> mapping memory, I don't think that's within the scope of my current >>> project. And if I don't wish to rewrite all of libxc's memory management >> I'm not talking about changing the libxc or ioctl interface, only about >> the internal-to-Linux interface between privcmd and mmu.c. In fact I'm >> only actually asking you to change the return value of the new function >> you are adding to the API. >> >>> code, I don't want to alter what values are returned or when. The >>> current code follows what WAS happening before my changes - which isn't >>> exactly the most fantastic thing, and I think there may actually be bugs >>> in there, such as: >>> if (ret < 0) { >>> if (ret == -ENOENT) >>> st->global_error = -ENOENT; >>> else { >>> /* Record that at least one error has happened. */ >>> if (st->global_error == 0) >>> st->global_error = 1; >>> } >>> } >>> if we enter this once with -EFAULT, and then after that with -ENOENT, >>> global_error will say -ENOENT. I think knowing that we got an EFAULT is >>> "higher importance" than ENOENT, but that's how the old code was >>> working, and I'm not sure I should change it at this point. >> But I think you are changing it, by this behaviour or returning the last >> error in the batch which causes this code to behave differently. That's >> what I'm asking you to avoid! >> >>>>> if (ret < 0) { >>>>> @@ -300,7 +332,7 @@ static int mmap_batch_fn(void *data, void *state) >>>>> st->global_error = 1; >>>>> } >>>>> } >>>>> - st->va += PAGE_SIZE; >>>>> + st->va += PAGE_SIZE * nr; >>>>> >>>>> return 0; >>>>> } >>>>> @@ -430,8 +462,8 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version) >>>>> state.err = err_array; >>>>> >>>>> /* mmap_batch_fn guarantees ret == 0 */ >>>>> - BUG_ON(traverse_pages(m.num, sizeof(xen_pfn_t), >>>>> - &pagelist, mmap_batch_fn, &state)); >>>>> + BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t), >>>>> + &pagelist, mmap_batch_fn, &state)); >>>> Can we make traverse_pages and _block common by passing a block size >>>> parameter? >>> Yes of course. Is there much benefit from that? I understand that it's >>> less code, but it also makes the original traverse_pages more complex. >>> Not convinced it helps much - it's quite a small function, so not much >>> extra code. Additionally, all of the callback functions will have to >>> deal with an extra parameter (that is probably ignored in all but one >>> place). >> It depends on what the combined version ends up looking like but in >> general I'm in favour of one more generic function over several cloned >> and hacked versions of the same thing. >> >> Ian. >>