xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Olaf Hering <olaf@aepfle.de>
To: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Cc: andres@gridcentric.ca, xen-devel@lists.xensource.com,
	tim@xen.org, adin@gridcentric.ca
Subject: Re: [PATCH 1 of 8] x86/mm: Fix paging_load
Date: Thu, 26 Jan 2012 13:43:04 +0100	[thread overview]
Message-ID: <20120126124304.GA3355@aepfle.de> (raw)
In-Reply-To: <251543dc172738014ff5bd13c5614c3d.squirrel@webmail.lagarcavilla.org>

On Thu, Jan 26, Andres Lagar-Cavilla wrote:

> > On Thu, Jan 26, Andres Lagar-Cavilla wrote:
> >
> >> Now, afaict, the p2m_ram_paging_in state is not needed anymore. Can you
> >> provide feedback as to whether
> >> 1. remove p2m_ram_paging_in
> >> 2. rename p2m_ram_paging_in_start to p2m_ram_paging_in
> >>
> >> sounds like a good plan?
> >
> > In my opinion the common case is that evicted pages get populated, an
> > request is sent. Later an response is expected to make room in the ring.
> >
> > If p2m_mem_paging_populate allocates a page for the guest, it can let
> > the pager know that it did so (or failed to allocate one).
> > If there is a page already, the pager can copy the gfn content into a
> > buffer, put a pointer to it in the response and let
> > p2m_mem_paging_resume() handle both the ring accounting (as it does now)
> > and also the copy_from_user.
> 
> So, this would bounce the page contents twice for the case when the page
> hasn't yet been evicted?

If it was not evicted, then nothing has to be done. In theory the page
could be resumed right away. But you wanted the notification, so a
kind-of dummy request has to be sent. The pager itself will figure out
quickly that the page was not evicted yet, so all it can do is to send a
dummy response.
I'm not sure what you mean with bouncing it twice. The pager itself has
likely written the page, so some IO happend. The hypervisor will find a
mfn for the gfn, right now it does nothing but doing p2mt changes.

> > If page allocation failed, the pager has to allocate one via
> > p2m_mem_paging_prep() as it is done now, as an intermediate step.
> 
> The issue of failed allocations is more pervasive. It also affects mem
> sharing. And even PoD. What I'm trying to say is that even though your
> solution seems to work (as long as the pager does dom0 ballooning to free
> up some memory in between populate and prep!), we need a more generic
> mechanism. Something along the lines of an ENOMEM mem_event ring, and a
> matching dom0 daemon.

I'm not sure how all that relates to putting an alloc_domheap_page()
into p2m_mem_paging_populate() and let the pager know about the results.

> >
> > The buffer page handling in the pager is probably simple, it needs to
> > maintain RING_SIZE() buffers. There cant be more than that in flight
> > because thats the limit of requests as well. In other words, the pager
> > does not need to wait for p2m_mem_paging_resume() to run and pull the
> > buffer content.
> >
> >
> > If the "populate - allocate - put_request - get_request - fill_buffer -
> > put_response - resume  get_response - copy_from_buffer - resume_vcpu"
> > cycle works, it would reduce the overall amount of work to be done
> > during paging, even if the hypercalls itself are not the bottleneck.
> > It all depends on the possibility to allocate a page in the various
> > contexts where p2m_mem_paging_populate is called.
> 
> The gist here is that the paging_load call would be removed?

No, it is required if alloc_domheap_page() in
p2m_mem_paging_populate() fails. Then the pager has to take the "slow
path" and try until it gets a page, like it does now.

> I like the general direction, but one excellent property of paging_resume
> is that it doesn't fail. This is particularly important since we already
> do resumes via ring responses and event channel kicks (see below). So, if
> resume needs to propagate failures back to the pager, things get icky.
> 
> (paging_load is restartable, see other email)

Once the request is sent, all what can happen is that the copy_from_user
fails. And in that case the guest is in an undefined state. Perhaps such
copy_from_user handling can be fixed to be reliable. In that case of
course my idea wont work.

> > The resume part could be done via eventchannel and
> > XEN_DOMCTL_MEM_EVENT_OP_PAGING_RESUME could be removed.
> 
> This is already the case. I'm not eager to remove the domctl, but resuming
> via event channel kick only is in place.

It looks to me like resume is currently called twice in xenpaging, once
per ioctl and once per event channel.

> > Also the question is if freeing one p2mt is more important than reducing
> > the number if hypercalls to execute at runtime.
> 
> Agreed. However, eliminating code complexity is also useful, and these two
> ram_paging_in states cause everyone headaches.

Well, its not so alien to me after starring at and tracing it long
enough.

Olaf

  reply	other threads:[~2012-01-26 12:43 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-26  3:53 [PATCH 0 of 8] x86/mm fixes Andres Lagar-Cavilla
2012-01-26  3:53 ` [PATCH 1 of 8] x86/mm: Fix paging_load Andres Lagar-Cavilla
2012-01-26  9:46   ` Olaf Hering
2012-01-26 10:49     ` Andres Lagar-Cavilla
2012-01-26 12:05       ` Olaf Hering
2012-01-26 12:23         ` Andres Lagar-Cavilla
2012-01-26 12:43           ` Olaf Hering [this message]
2012-01-26  3:53 ` [PATCH 2 of 8] x86/mm: Fix p2m teardown locking Andres Lagar-Cavilla
2012-01-26  3:53 ` [PATCH 3 of 8] x86/mm: Allow foreign read-only mappings of shared pages Andres Lagar-Cavilla
2012-01-26  3:53 ` [PATCH 4 of 8] x86/mm: Output domain count of paged pages in console Andres Lagar-Cavilla
2012-01-26  9:47   ` Olaf Hering
2012-01-26  3:53 ` [PATCH 5 of 8] x86/mm: Remove stale variable from debugtrace printk in p2m audit Andres Lagar-Cavilla
2012-01-26  3:53 ` [PATCH 6 of 8] x86/mm: Properly account for paged out pages Andres Lagar-Cavilla
2012-01-26  9:54   ` Olaf Hering
2012-01-26 10:47     ` Andres Lagar-Cavilla
2012-01-26 12:11       ` Olaf Hering
2012-01-26 12:26         ` Andres Lagar-Cavilla
2012-01-26 13:08           ` Olaf Hering
2012-01-26  3:53 ` [PATCH 7 of 8] x86/mm: clean use of p2m unlocked queries Andres Lagar-Cavilla
2012-01-26  3:53 ` [PATCH 8 of 8] x86/mm: Avoid spurious deadlock panic trigger Andres Lagar-Cavilla
2012-01-26 13:31 ` [PATCH 0 of 8] x86/mm fixes Tim Deegan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120126124304.GA3355@aepfle.de \
    --to=olaf@aepfle.de \
    --cc=adin@gridcentric.ca \
    --cc=andres@gridcentric.ca \
    --cc=andres@lagarcavilla.org \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).