From mboxrd@z Thu Jan  1 00:00:00 1970
From: Olaf Hering <olaf@aepfle.de>
Subject: Re: [PATCH 3 of 3] RFC: xenpaging: use waitqueue in
	ept_get
Date: Wed, 23 Nov 2011 17:37:28 +0100
Message-ID: <20111123163728.GA6000@aepfle.de>
References: <mailman.1126.1321998852.12970.xen-devel@lists.xensource.com>
	<b9f83375b8bf452e866311e3ec3fbbc0.squirrel@webmail.lagarcavilla.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
Content-Disposition: inline
In-Reply-To: <b9f83375b8bf452e866311e3ec3fbbc0.squirrel@webmail.lagarcavilla.org>
List-Unsubscribe: <http://lists.xensource.com/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On Tue, Nov 22, Andres Lagar-Cavilla wrote:

> Hi Olaf,
> 
> thanks for posting these RFC patches, great work!
> 
> I have several comments. Most importantly, I want to hash out the
> interactions between these wait queues and the work I've been doing on p2m
> synchronization. I've been runnning Xen with synchronized (i.e. locking)
> p2m lookups for a few weeks now with little/no trouble. You can refer to a
> patch I posted to the list previously, which I can resend. (I'm waiting on
> feedback on some previous patches I sent to keep pushing on this work)
> 
> - I think the waitqueue should be part of struct arch_domain. It is an x86
> construct that applies only to the base level p2m of the domain, not every
> possible p2m in a nested setting.

Good point, I will move it. On the other hand, its current location cant
be the final solution. A wake_up would start all waiting vcpus, not just
the ones waiting for a certain gfn (in case of paging). There needs to
be a better way for selective wake_up.

> - The right place to wait is not ept->get_entry, imho, but rather the
> implementation-independent caller (get_gfn_type_access). More p2m
> implementations could be added in the future (however unlikely that may
> be) that can also perform paging.

The wait could probably be moved one level up, yes.

> - With locking p2m lookups, one needs to be careful about in_atomic. I
> haven't performed a full audit of all callers, but this is what I can say:
> 1. there are no code paths that perform recursive p2m lookups, *on
> different gfns*, with p2m_query_t != p2m_query
> 2. there are recursive lookups of different gfns, with p2m_query_t ==
> p2m_query, most notably pod sweeps. These do not represent a problem here,
> since those paths will skip over gfns that are paged.
> 
> Why is this important? Because we need to be in a position where a code
> snippet such as
> 
> get_gfn_type_access(gfn) {
> retry:
>    p2m_lock()
>    mfn = p2m->get_entry(gfn, &p2mt)
>    if (p2mt == paging) {
>        p2m_unlock()
>        sleep_on_waitqueue()
>        goto retry;
>    }
> 
> works. This will get us a non-racy p2m that sends vcpu's to sleep without
> holding locks. Makes sense?

Something like that can be done if needed, yes.

> - What is the plan for grant operations for pv-on-hvm drivers? Those will
> enter get_gfn* holding the domain lock, and thus in an atomic context.

Is that a new thing? So far my PVonHVM guests did not encounter that
issue. I see do_grant_table_op() taking the domain_lock, but is this
code path really entered from the guest or rather from dom0?
__get_paged_frame() returns GNTST_eagain, and that needs to be handled
by callers of do_grant_table_op. linux-2.6.18-xen.hg has code to retry
the grant operation in that case.

Olaf