Re: [PATCH v5] x86/p2m: use large pages for MMIO mappings

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>, Wei Liu <wei.liu2@citrix.com>,
	Ian Campbell <ian.campbell@citrix.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	George Dunlap <George.Dunlap@eu.citrix.com>,
	Tim Deegan <tim@xen.org>, Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Jun Nakajima <jun.nakajima@intel.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	Keir Fraser <keir@xen.org>
Subject: Re: [PATCH v5] x86/p2m: use large pages for MMIO mappings
Date: Wed, 27 Jan 2016 14:28:17 +0000
Message-ID: <56A8D401.6080100@citrix.com>
In-Reply-To: <56A8D61202000078000CB8EF@prv-mh.provo.novell.com>

On 27/01/16 13:37, Jan Beulich wrote:
>>>> On 27.01.16 at 13:32, <andrew.cooper3@citrix.com> wrote:
>> On 25/01/16 16:18, Jan Beulich wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -2491,7 +2491,7 @@ static int vmx_alloc_vlapic_mapping(stru
>>>      share_xen_page_with_guest(pg, d, XENSHARE_writable);
>>>      d->arch.hvm_domain.vmx.apic_access_mfn = mfn;
>>>      set_mmio_p2m_entry(d, paddr_to_pfn(APIC_DEFAULT_PHYS_BASE), _mfn(mfn),
>>> -                       p2m_get_hostp2m(d)->default_access);
>>> +                       PAGE_ORDER_4K, p2m_get_hostp2m(d)->default_access);
>>>  
>> This should ASSERT() success, in case we make further changes to the
>> error handling.
> Maybe, but since it didn't before I don't see why this couldn't /
> shouldn't be an independent future patch.

Can be.  IMO it is a bug that it isn't already checked.  (-ENOMEM when
allocating p2m leaves perhaps?)

>
>>> --- a/xen/arch/x86/mm/p2m.c
>>> +++ b/xen/arch/x86/mm/p2m.c
>>> @@ -899,48 +899,62 @@ void p2m_change_type_range(struct domain
>>>      p2m_unlock(p2m);
>>>  }
>>>  
>>> -/* Returns: 0 for success, -errno for failure */
>>> +/*
>>> + * Returns:
>>> + *    0        for success
>>> + *    -errno   for failure
>>> + *    order+1  for caller to retry with order (guaranteed smaller than
>>> + *             the order value passed in)
>>> + */
>>>  static int set_typed_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
>>> -                               p2m_type_t gfn_p2mt, p2m_access_t access)
>>> +                               unsigned int order, p2m_type_t gfn_p2mt,
>>> +                               p2m_access_t access)
>>>  {
>>>      int rc = 0;
>>>      p2m_access_t a;
>>>      p2m_type_t ot;
>>>      mfn_t omfn;
>>> +    unsigned int cur_order = 0;
>>>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>>  
>>>      if ( !paging_mode_translate(d) )
>>>          return -EIO;
>>>  
>>> -    gfn_lock(p2m, gfn, 0);
>>> -    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, NULL, NULL);
>>> +    gfn_lock(p2m, gfn, order);
>>> +    omfn = p2m->get_entry(p2m, gfn, &ot, &a, 0, &cur_order, NULL);
>>> +    if ( cur_order < order )
>>> +    {
>>> +        gfn_unlock(p2m, gfn, order);
>>> +        return cur_order + 1;
>> Your comment states that the return value is guaranteed to be less than
>> the passed-in order, but this is not the case here.  cur_order could, in
>> principle, be only 1 less than order, at which point your documentation
>> is incorrect.
>>
>> Does this rely on the x86 architectural orders to function as documented?
> No. Maybe the comment text is ambiguous, but I don't see how to
> improve it without making it too lengthy: The return value is
> <order>+1, telling the caller to retry with <order>, which is
> guaranteed to be less than the order that got passed in. I.e. taking
> the variable naming above, the caller would have to retry with
> cur_order, which - due to the if() - is smaller than order.

Ah - I see.  The text is indeed confusing.  How about:

"1 + new order: for caller to retry with smaller order (guaranteed to be
smaller than order passed in)"
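
To make the convention concrete, here is a minimal sketch of how a caller
might consume such a return value. It is not the patch's code:
mock_set_entry(), existing_order, attempts and map_with_retry() are
hypothetical stand-ins, and the mock only models the "existing entry is
smaller than the requested order" case.

```c
#include <assert.h>

/* Mock state: order of the entry currently covering the gfn (assumption). */
static unsigned int existing_order;
/* Counts calls to the mock, purely for illustration. */
static unsigned int attempts;

/*
 * Hypothetical stand-in for set_typed_p2m_entry():
 *   0            success
 *   1 + new order  caller should retry with "new order"
 */
static int mock_set_entry(unsigned long gfn, unsigned int order)
{
    (void)gfn;
    ++attempts;
    if ( existing_order < order )
        return existing_order + 1;
    return 0;
}

/* Returns 0 or a negative error; *used receives the order that succeeded. */
static int map_with_retry(unsigned long gfn, unsigned int order,
                          unsigned int *used)
{
    for ( ; ; )
    {
        int rc = mock_set_entry(gfn, order);

        if ( rc <= 0 )
        {
            if ( rc == 0 )
                *used = order;
            return rc;
        }

        /* rc may equal order, but the order to retry with is rc - 1. */
        assert((unsigned int)rc - 1 < order);
        order = rc - 1;
    }
}
```

With an existing order of 8 and a requested order of 9, the mock returns 9
(equal to the order passed in), while the order to retry with is 8. A real
caller would also have to map the rest of the original range at the smaller
order, which the sketch leaves out.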

>
>>> +    }
>>>      if ( p2m_is_grant(ot) || p2m_is_foreign(ot) )
>>>      {
>>> -        gfn_unlock(p2m, gfn, 0);
>>> +        gfn_unlock(p2m, gfn, order);
>>>          domain_crash(d);
>>>          return -ENOENT;
>>>      }
>>>      else if ( p2m_is_ram(ot) )
>>>      {
>>> +        unsigned long i;
>>> +
>>>          ASSERT(mfn_valid(omfn));
>> Shouldn't this check be extended to the top of the order?
> Well, yes, perhaps better to move it into ...
>
>>> -        set_gpfn_from_mfn(mfn_x(omfn), INVALID_M2P_ENTRY);
>>> +        for ( i = 0; i < (1UL << order); ++i )
>>> +            set_gpfn_from_mfn(mfn_x(omfn) + i, INVALID_M2P_ENTRY);
> ... the body of the for(). But I'll hold off on v6 until we have settled
> the other aspects you raise.
>
>>>  int set_mmio_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn,
>>> -                       p2m_access_t access)
>>> +                       unsigned int order, p2m_access_t access)
>>>  {
>>> -    return set_typed_p2m_entry(d, gfn, mfn, p2m_mmio_direct, access);
>>> +    if ( order &&
>>> +         rangeset_overlaps_range(mmio_ro_ranges, mfn_x(mfn),
>>> +                                 mfn_x(mfn) + (1UL << order) - 1) &&
>>> +         !rangeset_contains_range(mmio_ro_ranges, mfn_x(mfn),
>>> +                                  mfn_x(mfn) + (1UL << order) - 1) )
>>> +        return order;
>> Should this not be a hard error?  Even retrying with a lower order is
>> going to fail.
> Why? At the latest when order == 0, rangeset_overlaps_range()
> will return the same as rangeset_contains_range(), and hence
> the condition above will always be false (one of the two reasons
> for checking order first here).

It isn't the order check which is an issue.

One way or another, if the original (mfn/order) fails the rangeset
checks, the overall call is going to fail, but it will be re-executed
repeatedly with an order decreasing to 0.  Wouldn't it be better just to
short-circuit this back-and-forth?

Relatedly, is there actually anything wrong with making a superpage
read-only mapping over some scattered read-only 4K pages?
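
For reference, a small sketch of how the overlaps/contains condition behaves
as the order drops. It does not use the real rangeset API: ro_overlaps() and
ro_contains() are hypothetical stand-ins over a single made-up read-only
range, and partial_ro() and first_acceptable_order() are illustrative names.

```c
#include <stdbool.h>

/* Hypothetical single read-only MFN range [RO_START, RO_END], inclusive. */
#define RO_START 0x105UL
#define RO_END   0x105UL

/* Stand-in for an "overlaps" query: any frame of [s, e] is read-only. */
static bool ro_overlaps(unsigned long s, unsigned long e)
{
    return s <= RO_END && e >= RO_START;
}

/* Stand-in for a "contains" query: every frame of [s, e] is read-only. */
static bool ro_contains(unsigned long s, unsigned long e)
{
    return s >= RO_START && e <= RO_END;
}

/* True when [mfn, mfn + 2^order - 1] is only partly read-only. */
static bool partial_ro(unsigned long mfn, unsigned int order)
{
    unsigned long last = mfn + (1UL << order) - 1;

    return order && ro_overlaps(mfn, last) && !ro_contains(mfn, last);
}

/* Largest order <= max_order at which the check no longer trips. */
static unsigned int first_acceptable_order(unsigned long mfn,
                                           unsigned int max_order)
{
    unsigned int order = max_order;

    /* Terminates: partial_ro() is always false for order 0. */
    while ( partial_ro(mfn, order) )
        --order;

    return order;
}
```

With the single read-only frame at 0x105, a request starting at MFN 0x100
with order 8 steps down to order 2 (0x100..0x103, no overlap), while a
request starting at 0x105 itself steps all the way down to order 0.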

~Andrew
