From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH 1/2] x86/mem-sharing: Bulk mem-sharing
	entire domains
Date: Mon, 5 Oct 2015 16:39:53 +0100
Message-ID: <561299C9.3060501@citrix.com>
References: <1443990339-19590-1-git-send-email-tamas@tklengyel.com>
	<1443990339-19590-2-git-send-email-tamas@tklengyel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <prvs=7136ef35e=Andrew.Cooper3@citrix.com>)
	id 1Zj7s7-0001Gu-3R
	for xen-devel@lists.xenproject.org; Mon, 05 Oct 2015 15:40:15 +0000
In-Reply-To: <1443990339-19590-2-git-send-email-tamas@tklengyel.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Tamas K Lengyel <tamas@tklengyel.com>, xen-devel@lists.xenproject.org
Cc: Wei Liu <wei.liu2@citrix.com>, Ian Campbell <ian.campbell@citrix.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, George Dunlap <george.dunlap@eu.citrix.com>, Ian Jackson <ian.jackson@eu.citrix.com>, Jan Beulich <jbeulich@suse.com>, Tamas K Lengyel <tlengyel@novetta.com>, Keir Fraser <keir@xen.org>
List-Id: xen-devel@lists.xenproject.org

On 04/10/15 21:25, Tamas K Lengyel wrote:
> Currently mem-sharing can be performed on a page-by-page base from the control
> domain. However, when completely deduplicating (cloning) a VM, this requires
> at least 3 hypercalls per page. As the user has to loop through all pages up
> to max_gpfn, this process is very slow and wasteful.

Indeed.

>
> This patch introduces a new mem_sharing memop for bulk deduplication where
> the user doesn't have to separately nominate each page in both the source and
> destination domain, and the looping over all pages happen in the hypervisor.
> This significantly reduces the overhead of completely deduplicating entire
> domains.

Looks good in principle.

> @@ -1467,6 +1503,56 @@ int mem_sharing_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_sharing_op_t) arg)
>          }
>          break;
>  
> +        case XENMEM_sharing_op_bulk_dedup:
> +        {
> +            unsigned long max_sgfn, max_cgfn;
> +            struct domain *cd;
> +
> +            rc = -EINVAL;
> +            if ( !mem_sharing_enabled(d) )
> +                goto out;
> +
> +            rc = rcu_lock_live_remote_domain_by_id(mso.u.share.client_domain,
> +                                                   &cd);
> +            if ( rc )
> +                goto out;
> +
> +            rc = xsm_mem_sharing_op(XSM_DM_PRIV, d, cd, mso.op);
> +            if ( rc )
> +            {
> +                rcu_unlock_domain(cd);
> +                goto out;
> +            }
> +
> +            if ( !mem_sharing_enabled(cd) )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            max_sgfn = domain_get_maximum_gpfn(d);
> +            max_cgfn = domain_get_maximum_gpfn(cd);
> +
> +            if ( max_sgfn != max_cgfn || max_sgfn < start_iter )
> +            {
> +                rcu_unlock_domain(cd);
> +                rc = -EINVAL;
> +                goto out;
> +            }
> +
> +            rc = bulk_share(d, cd, max_sgfn, start_iter, MEMOP_CMD_MASK);
> +            if ( rc > 0 )
> +            {
> +                ASSERT(!(rc & MEMOP_CMD_MASK));

The way other continuations like this work is to shift the remaining
work left by MEMOP_EXTENT_SHIFT.

This avoids bulk_share() needing to know MEMOP_CMD_MASK, but does chop 6
bits off the available max_sgfn.

However, a better alternative would be to extend xen_mem_sharing_op and
stash the continue information in a new union.  That would avoid the
mask games, and also avoid limiting the maximum potential gfn.

~Andrew