Re: [RFC] Overview of work required to implement mem_access for PV guests

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: "Aravindh Puthiyaparambil (aravindp)" <aravindp@cisco.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"Tim Deegan (tim@xen.org)" <tim@xen.org>
Subject: Re: [RFC] Overview of work required to implement mem_access for PV guests
Date: Mon, 25 Nov 2013 20:18:15 +0000
Message-ID: <5293B087.9060407@citrix.com>
In-Reply-To: <97A500D504438F4ABC02EBA81613CC6331702B35@xmb-aln-x02.cisco.com>

On 25/11/13 19:39, Aravindh Puthiyaparambil (aravindp) wrote:
>> On 25/11/13 07:49, Aravindh Puthiyaparambil (aravindp) wrote:
>>> The mem_access APIs only work with HVM guests that run on Intel
>>> hardware with EPT support. This effort is to enable it for PV guests that run
>>> with shadow page tables. To facilitate this, the following will be done:
>>
>> Are you sure that this is only Intel with EPT?  It looks to be a HAP feature,
>> which includes AMD with NPT support.
> Yes, mem_access is gated on EPT being available.
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=xen/arch/x86/mm/mem_event.c;h=d00e4041b2bd099b850644db86449c8a235f0f5a;hb=HEAD#l586
>
> However, I think it is possible to implement this for NPT also.

So it is - I missed that.

>
>>> 1. A magic page will be created for the mem_access (mem_event) ring
>>> buffer during the PV domain creation.
>>
>> Where is this magic page being created from? This will likely have to be at the
>> behest of the domain creation flags to avoid making it for the vast majority of
>> domains which won't want the extra overhead.
> This page will be similar to the console, xenstore and start_info pages.
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/xc_dom_x86.c;h=e034d62373c7a080864d1aefaa6a06412653c9af;hb=HEAD#l452
>
> I can definitely make it depend on a domain creation flag. However, on the HVM side, pages for all mem_events, including mem_access, are created by default.
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=tools/libxc/xc_hvm_build_x86.c;h=q;hb=HEAD#l487
>
> So is it ok to have a domain creation flag just for mem_access for PV guests?

The start_info and xenstore pages are critical for a PV guest to boot,
and the console is fairly useful (although not essential).  These pages
belong to the guest and the guest has full read/write access and control
over the pages.

For HVM guests, the special pfns are hidden in the MMIO region, and have
no access by default.  HVM domains need to use add_to_physmap to get
access to a subset of the magic pages.

I do not think it is reasonable for a guest to be able to access its own
mem_access page, and I am not sure how best to prevent PV guests from
getting at it.

>
>>> 2. Most of the mem_event / mem_access function and variable names are
>>> HVM specific. Given that I am enabling it for PV, I will change the names to
>>> something more generic. This also holds for the mem_access hypercalls,
>>> which fall under HVM ops and do_hvm_op(). My plan is to make them a
>>> memory op or a domctl.
>>
>> You cannot remove the hvmops.  That would break the hypervisor ABI.
>>
>> You can certainly introduce new (more generic) hypercalls, implement the
>> hvmop ones in terms of the new ones and mark the hvmop ones as
>> deprecated in the documentation.
> Sorry, I should have been more explicit in the above paragraph. I was planning on doing exactly what you have said. I will be adding a new hypercall interface for the PV guests; we can then use that for HVM also and keep the old hvm_op hypercall interface as an alias.
> I would do something similar on the tool stack side: create xc_domain_*_access() or xc_*_access() and make them wrappers that call xc_hvm_*_access(), or vice versa. Then move the functions to xc_domain.c or xc_mem_access.c. This way I am hoping the existing libxc APIs will still work.
>
> Thanks,
> Aravindh

Ah ok - that looks sensible overall.
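
To illustrate the compatibility approach discussed above, here is a minimal, self-contained sketch of a legacy entry point implemented in terms of a new generic one. Everything in it is hypothetical: `struct domain`, `generic_set_mem_access()`, `legacy_hvm_set_mem_access()`, the `access_t` values and the fixed-size table are stand-ins for illustration, not the real Xen hypercall handlers or libxc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced set of access types (illustration only). */
typedef enum { ACCESS_N, ACCESS_R, ACCESS_RW, ACCESS_RWX } access_t;

#define MAX_PAGES 16u

/* Hypothetical stand-in for a domain; not Xen's struct domain. */
struct domain {
    int is_hvm;
    access_t access[MAX_PAGES];
};

/* New generic operation: works for any guest type. */
int generic_set_mem_access(struct domain *d, uint64_t pfn, uint32_t nr,
                           access_t a)
{
    assert(d != NULL);

    /* Reject empty or out-of-range requests without overflowing. */
    if (nr == 0 || pfn >= MAX_PAGES || nr > MAX_PAGES - pfn)
        return -1;

    for (uint32_t i = 0; i < nr; i++)
        d->access[pfn + i] = a;

    return 0;
}

/*
 * Legacy HVM-only entry point, kept so existing callers keep working.
 * It only validates the guest type and forwards to the generic op.
 */
int legacy_hvm_set_mem_access(struct domain *d, uint64_t pfn, uint32_t nr,
                              access_t a)
{
    assert(d != NULL);

    if (!d->is_hvm)
        return -1;

    return generic_set_mem_access(d, pfn, nr, a);
}
```

The point of the shape is that the old interface carries no logic of its own, so it can be documented as deprecated while remaining part of the ABI.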

~Andrew
>>
>>> 3. A new shadow option will be added called PG_mem_access. This mode is
>>> basic shadow mode with the addition of a table that will track the access
>>> permissions of each page in the guest.
>>> mem_access_tracker[gmfn] = access_type
>>> If there is a place where I can stash this in an existing structure,
>>> please point me at it.
>>> This will be enabled using xc_shadow_control() before attempting to enable
>>> mem_access on a PV guest.
>>> 4. xc_mem_access_enable/disable(): Change the flow to allow mem_access
>>> for PV guests running with PG_mem_access shadow mode.
>>> 5. xc_domain_set_access_required(): No change required
>>>
>>> 6. xc_(hvm)_set_mem_access(): This API has two modes. If the start
>>> pfn/gmfn is ~0ull, it is taken as a request to set the default access. Here
>>> we will call shadow_blow_tables() after recording the default access type
>>> for the domain. In the mode where it sets the mem_access type for individual
>>> gmfns, we will call a function that drops the shadow for that individual
>>> gmfn. I am not sure which function to call. Will
>>> sh_remove_all_mappings(gmfn) do the trick? Please advise.
>>> The other issue here is that in the HVM case we could use
>>> xc_hvm_set_mem_access(gfn, nr) and the permissions for the range gfn to
>>> gfn+nr would be set. This won't be possible in the PV case as we are actually
>>> dealing with mfns, and mfn to mfn+nr need not belong to the same guest. But
>>> given that setting *all* page access permissions is done implicitly when
>>> setting default access, I think we can live with setting page permissions one
>>> at a time as they are faulted in.
>>> 7. xc_(hvm)_get_mem_access(): This will return the access type for gmfn
>>> from the mem_access_tracker table.
>>> 8. In sh_page_fault() perform access checks similar to
>>> ept_handle_violation() / hvm_hap_nested_page_fault().
>>> 9. Hook in to _sh_propagate() and set up the L1 entries based on access
>>> permissions. This will be similar to ept_p2m_type_to_flags(). I think I might
>>> also have to hook in to the code that emulates page table writes to ensure
>>> access permissions are honored there too.
>>> Please give feedback on the above.
>>>
>>> Thanks,
>>> Aravindh
>>>
>>>
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@lists.xen.org
>>> http://lists.xen.org/xen-devel
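
The tracking table and permission handling proposed in points 3 and 6 to 9 of the quoted plan can be sketched roughly as follows. This is only an illustration under stated assumptions: `struct tracker`, `tracker_get()`, `tracker_set()`, `access_to_pte_flags()`, `access_allowed()` and the `acc_t` values are hypothetical names, the table is a tiny fixed-size array, and the real shadow code (`_sh_propagate()`, `sh_page_fault()`, `shadow_blow_tables()`) is not called. The only non-hypothetical details are the x86 PTE bit positions for Present, Read/Write and No-Execute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access types; ACC_DEFAULT marks "no per-page override". */
typedef enum { ACC_DEFAULT = 0, ACC_N, ACC_R, ACC_RW, ACC_RX, ACC_RWX } acc_t;

/* x86 page table entry bits. */
#define PTE_PRESENT (1ull << 0)
#define PTE_RW      (1ull << 1)
#define PTE_NX      (1ull << 63)

#define NR_PAGES 8u

/* Hypothetical per-domain table: mem_access_tracker[gmfn] = access_type. */
struct tracker {
    acc_t default_access;
    acc_t page[NR_PAGES];
};

/* Look up the effective access type for one gmfn. */
acc_t tracker_get(const struct tracker *t, uint64_t gmfn)
{
    assert(t != NULL);
    if (gmfn >= NR_PAGES || t->page[gmfn] == ACC_DEFAULT)
        return t->default_access;
    return t->page[gmfn];
}

/*
 * Two modes, as in the proposal: gmfn == ~0ull sets the default access,
 * anything else sets a single page. A real implementation would also drop
 * the affected shadows here; this sketch only records the value and keeps
 * existing per-page overrides when the default changes.
 */
int tracker_set(struct tracker *t, uint64_t gmfn, acc_t a)
{
    assert(t != NULL);
    if (a == ACC_DEFAULT)
        return -1;
    if (gmfn == ~0ull) {
        t->default_access = a;
        return 0;
    }
    if (gmfn >= NR_PAGES)
        return -1;
    t->page[gmfn] = a;
    return 0;
}

/* Map an access type to the PTE flags a shadow L1 entry could carry. */
uint64_t access_to_pte_flags(acc_t a)
{
    switch (a) {
    case ACC_R:   return PTE_PRESENT | PTE_NX;
    case ACC_RW:  return PTE_PRESENT | PTE_RW | PTE_NX;
    case ACC_RX:  return PTE_PRESENT;
    case ACC_RWX: return PTE_PRESENT | PTE_RW;
    default:      return 0; /* ACC_N: not present, every access faults */
    }
}

/* Fault-time check: would this kind of access be permitted? */
bool access_allowed(acc_t a, bool write, bool exec)
{
    uint64_t flags = access_to_pte_flags(a);

    if (!(flags & PTE_PRESENT))
        return false;
    if (write && !(flags & PTE_RW))
        return false;
    if (exec && (flags & PTE_NX))
        return false;
    return true;
}
```

Deriving the fault-time check from the same flag mapping keeps the two from drifting apart, which matters if the page table write emulation path has to honor the same permissions.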
