xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Mats Petersson <mats.petersson@citrix.com>
To: Andres Lagar-Cavilla <andres.lagarcavilla@gmail.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	"JBeulich@suse.com" <JBeulich@suse.com>,
	David Vrabel <david.vrabel@citrix.com>,
	"mats@planetcatfish.com" <mats@planetcatfish.com>,
	"konrad@darnok.org" <konrad@darnok.org>,
	Andres Lagar-Cavilla <andres@lagarcavilla.org>
Subject: Re: [PATCH V5] xen/privcmd.c: improve performance of MMAPBATCH_V2.
Date: Wed, 2 Jan 2013 16:45:46 +0000	[thread overview]
Message-ID: <50E4643A.2090003@citrix.com> (raw)
In-Reply-To: <716BE5EF-DAAC-47DE-A4FF-96D67219E4B3@gmail.com>

On 02/01/13 15:47, Andres Lagar-Cavilla wrote:
> Hi,
> On Dec 20, 2012, at 6:32 AM, Ian Campbell <ian.campbell@citrix.com> wrote:
>
>> On Thu, 2012-12-20 at 11:00 +0000, Mats Petersson wrote:
>>
>>>>> -       err = 0;
>>>>> +       err = last_err;
>>>> This means that if you have 100 failures followed by one success you
>>>> return success overall. Is that intentional? That doesn't seem right.
>>> As far as I see, it doesn't mean that. last_err is only set at the
>>> beginning of the call (to zero) and if there is an error.
>> Ah, yes I missed that (but it still isn't right, see below).
>>
>>>>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>>>>> index 0bbbccb..8f86a44 100644
>>>>> --- a/drivers/xen/privcmd.c
>>>>> +++ b/drivers/xen/privcmd.c
>>>> [...]
>>>>> @@ -283,12 +317,10 @@ static int mmap_batch_fn(void *data, void *state)
>>>>>          if (xen_feature(XENFEAT_auto_translated_physmap))
>>>>>                  cur_page = pages[st->index++];
>>>>>
>>>>> -       ret = xen_remap_domain_mfn_range(st->vma, st->va & PAGE_MASK, *mfnp, 1,
>>>>> -                                        st->vma->vm_page_prot, st->domain,
>>>>> -                                        &cur_page);
>>>>> -
>>>>> -       /* Store error code for second pass. */
>>>>> -       *(st->err++) = ret;
>>>>> +       BUG_ON(nr < 0);
>>>>> +       ret = xen_remap_domain_mfn_array(st->vma, st->va & PAGE_MASK, mfnp, nr,
>>>>> +                                        st->err, st->vma->vm_page_prot,
>>>>> +                                        st->domain, &cur_page);
>>>>>
>>>>>          /* And see if it affects the global_error. */
>>>> The code which follows needs adjustment to cope with the fact that we
>>>> now batch. I think it needs to walk st->err and set global_error as
>>>> appropriate. This is related to my comment about the return value of
>>>> xen_remap_domain_mfn_range too.
>>> The return value, as commented above, is "either 0 or the last error we
>>> saw". There should be no need to walk the st->err, as we know if there
>>> was some error or not.
>> The code which follows tries to latch having seen ENOENT in preference
>> to other errors, so you don't need 0 or the last error, you need either
>> 0, ENOENT if one occurred somewhere in the batch or an error!=ENOENT.
>> (or maybe its vice versa, either way the last error isn't necessarily
>> what you need).
> Yeah the error reporting interface sucks. All I can say in my defense is that it was that way when I got here.
>
> FWIW EFAULT *takes* precedence over ENOENT.
> /* If we have not had any EFAULT-like global errors then set the global
>   * error to -ENOENT if necessary. */
>     if ((ret == 0) && (state.global_error == -ENOENT))
>           ret = -ENOENT;
>
> Otherwise, any individual mapping that fails differently from ENOENT does not affect the global error. That is, in absence of global errors like EFAULT, and in absence of per-mapping ENOENT return codes, per mapping errors like EINVAL do not affect the ioctl return code which remains zero.
>
> IIUC Ian has been pointing that your code breaks this last statement.
Sorry, are you saying the correct behaviour is to return 0 if we have 
EINVAL or some other error [aside from EFAULT and ENOENT]? Or are you 
saying that my code is broken such that it does this?

I did some fixes, that I believe "does the right thing", before 
Christmas, but since I knew most people wouldn't be available, I didn't 
post it. I will try to get round to it before the end of the week.

--
Mats
>
> Thanks
> Andres
>>>> I think rather than trying to fabricate some sort of meaningful error
>>>> code for an entire batch xen_remap_domain_mfn_range should just return
>>>> an indication about whether there were any errors or not and leave it to
>>>> the caller to figure out the overall result by looking at the err array.
>>>>
>>>> Perhaps return either the number of errors or the number of successes
>>>> (which turns the following if into either (ret) or (ret < nr)
>>>> respectively).
>>> I'm trying to not change how the code above expects things to work.
>>> Whilst it would be lovely to rewrite the entire code dealing with
>>> mapping memory, I don't think that's within the scope of my current
>>> project. And if I don't wish to rewrite all of libxc's memory management
>> I'm not talking about changing the libxc or ioctl interface, only about
>> the internal-to-Linux interface between privcmd and mmu.c. In fact I'm
>> only actually asking you to change the return value of the new function
>> you are adding to the API.
>>
>>> code, I don't want to alter what values are returned or when. The
>>> current code follows what WAS happening before my changes - which isn't
>>> exactly the most fantastic thing, and I think there may actually be bugs
>>> in there, such as:
>>>      if (ret < 0) {
>>>          if (ret == -ENOENT)
>>>              st->global_error = -ENOENT;
>>>          else {
>>>              /* Record that at least one error has happened. */
>>>              if (st->global_error == 0)
>>>                  st->global_error = 1;
>>>          }
>>>      }
>>> if we enter this once with -EFAULT, and then after that with -ENOENT,
>>> global_error will say -ENOENT. I think knowing that we got an EFAULT is
>>> "higher importance" than ENOENT, but that's how the old code was
>>> working, and I'm not sure I should change it at this point.
>> But I think you are changing it, by this behaviour or returning the last
>> error in the batch which causes this code to behave differently. That's
>> what I'm asking you to avoid!
>>
>>>>>          if (ret < 0) {
>>>>> @@ -300,7 +332,7 @@ static int mmap_batch_fn(void *data, void *state)
>>>>>                                  st->global_error = 1;
>>>>>                  }
>>>>>          }
>>>>> -       st->va += PAGE_SIZE;
>>>>> +       st->va += PAGE_SIZE * nr;
>>>>>
>>>>>          return 0;
>>>>>   }
>>>>> @@ -430,8 +462,8 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
>>>>>          state.err           = err_array;
>>>>>
>>>>>          /* mmap_batch_fn guarantees ret == 0 */
>>>>> -       BUG_ON(traverse_pages(m.num, sizeof(xen_pfn_t),
>>>>> -                            &pagelist, mmap_batch_fn, &state));
>>>>> +       BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
>>>>> +                                   &pagelist, mmap_batch_fn, &state));
>>>> Can we make traverse_pages and _block common by passing a block size
>>>> parameter?
>>> Yes of course. Is there much benefit from that? I understand that it's
>>> less code, but it also makes the original traverse_pages more complex.
>>> Not convinced it helps much - it's quite a small function, so not much
>>> extra code. Additionally, all of the callback functions will have to
>>> deal with an extra parameter (that is probably ignored in all but one
>>> place).
>> It depends on what the combined version ends up looking like but in
>> general I'm in favour of one more generic function over several cloned
>> and hacked versions of the same thing.
>>
>> Ian.
>>

  reply	other threads:[~2013-01-02 16:45 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-19 18:53 [PATCH V5] xen/privcmd.c: improve performance of MMAPBATCH_V2 Mats Petersson
2012-12-20 10:40 ` Ian Campbell
2012-12-20 11:00   ` Mats Petersson
2012-12-20 11:32     ` Ian Campbell
2013-01-02 15:47       ` Andres Lagar-Cavilla
2013-01-02 16:45         ` Mats Petersson [this message]
2013-01-02 16:52           ` Andres Lagar-Cavilla
2012-12-20 11:26   ` Ian Campbell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50E4643A.2090003@citrix.com \
    --to=mats.petersson@citrix.com \
    --cc=Ian.Campbell@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=andres.lagarcavilla@gmail.com \
    --cc=andres@lagarcavilla.org \
    --cc=david.vrabel@citrix.com \
    --cc=konrad@darnok.org \
    --cc=mats@planetcatfish.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).