From mboxrd@z Thu Jan  1 00:00:00 1970
From: Anthony Liguori <aliguori@us.ibm.com>
Subject: Re: vram_dirty vs. shadow paging dirty tracking
Date: Tue, 13 Mar 2007 16:30:22 -0500
Message-ID: <45F717EE.5040900@us.ibm.com>
References: <45F6FC68.3040207@us.ibm.com>
	<8A87A9A84C201449A0C56B728ACF491E0B9DBF@liverpoolst.ad.cl.cam.ac.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xensource.com>
In-Reply-To: <8A87A9A84C201449A0C56B728ACF491E0B9DBF@liverpoolst.ad.cl.cam.ac.uk>
List-Unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xensource.com>
List-Help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-Subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Ian Pratt <Ian.Pratt@cl.cam.ac.uk>
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

Ian Pratt wrote:
>> When thinking about multithreading the device model, it occurred to me
>> that it's a little odd that we're doing a memcmp to determine which
>> portions of the VRAM have changed.  Couldn't we just use dirty page
>> tracking in the shadow paging code?  That should significantly lower
>> the overhead of this, plus I believe the infrastructure is already
>> mostly there in the shadow2 code.
>>     
>
> Yep, it's been in the roadmap doc for quite a while. However, the log
> dirty code isn't ideal for this. We'd need to extend it to enable it to
> be turned on for just a subset of the GFN range (we could use a xen
> rangeset for this).
>   

Okay, I was curious if the log dirty stuff could do ranges.  I guess not.
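
To make the range idea concrete, here is a rough sketch of dirty logging
restricted to one GFN range.  It is only an illustration: struct
range_dirty_log and the range_log_* helpers are made-up names, not the
existing log dirty code or the Xen rangeset API, and a single inclusive
range stands in for a real rangeset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cap on how many pages one log can track. */
#define MAX_TRACKED_PAGES 4096

/* Hypothetical stand-in for a rangeset: one inclusive GFN range. */
struct gfn_range {
    uint64_t start;
    uint64_t end;   /* inclusive */
};

/* Dirty log covering only the pages inside 'range', one bit per page. */
struct range_dirty_log {
    struct gfn_range range;
    uint8_t bitmap[MAX_TRACKED_PAGES / 8];
};

static void range_log_init(struct range_dirty_log *log,
                           uint64_t start, uint64_t end)
{
    assert(end >= start && end - start < MAX_TRACKED_PAGES);
    log->range.start = start;
    log->range.end = end;
    memset(log->bitmap, 0, sizeof(log->bitmap));
}

/*
 * Imagined to be called from the write-fault path.  GFNs outside the
 * tracked range are ignored, so they pay no logging cost.
 * Returns true if the GFN was in range and has been recorded.
 */
static bool range_log_mark_dirty(struct range_dirty_log *log, uint64_t gfn)
{
    if (gfn < log->range.start || gfn > log->range.end)
        return false;

    uint64_t idx = gfn - log->range.start;
    log->bitmap[idx / 8] |= (uint8_t)(1u << (idx % 8));
    return true;
}

/* Report whether a tracked GFN was dirty and clear its bit. */
static bool range_log_test_and_clear(struct range_dirty_log *log,
                                     uint64_t gfn)
{
    if (gfn < log->range.start || gfn > log->range.end)
        return false;

    uint64_t idx = gfn - log->range.start;
    uint8_t mask = (uint8_t)(1u << (idx % 8));
    bool was_dirty = (log->bitmap[idx / 8] & mask) != 0;

    log->bitmap[idx / 8] &= (uint8_t)~mask;
    return was_dirty;
}
```

The device model would then query only the VRAM GFNs each refresh
instead of comparing the whole buffer.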

> Even so, I'm not super keen on the idea of tearing down and rebuilding
> 1024 PTEs up to 50 times a second. 
>
> A lower overhead solution would be to do scanning and resetting of the
> dirty bits on the PTEs (and a global tlb flush).

Right, this is the approach I was assuming.  There's really no use in 
tearing down the whole PTE (since you would have to take an extraneous 
read fault).
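
Roughly, the scan-and-reset pass could look like the sketch below.  It
assumes the PTEs covering the framebuffer are available as a flat array
of 64-bit entries and uses the x86 dirty bit (bit 6); the function name
and the one-byte-per-page output are made up for illustration.  Real
code would need an atomic test-and-clear, since the hardware can set the
bit concurrently, and the TLB flush is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dirty flag in an x86 page table entry (bit 6). */
#define PTE_DIRTY (1ULL << 6)

/*
 * Walk 'n' PTEs, note which ones had the dirty bit set, and clear it.
 * The mapping itself stays intact, so no extra faults are taken.
 * dirty_out[i] is set to 1 if page i was written since the last scan.
 * Returns the number of dirty pages; if it is non-zero the caller must
 * flush the TLB so later writes set the dirty bit again.
 */
static size_t scan_and_clear_dirty(uint64_t *ptes, size_t n,
                                   uint8_t *dirty_out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (ptes[i] & PTE_DIRTY) {
            ptes[i] &= ~PTE_DIRTY;   /* non-atomic: sketch only */
            dirty_out[i] = 1;
            count++;
        } else {
            dirty_out[i] = 0;
        }
    }
    return count;
}
```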

> In the general case
> this is tricky as the framebuffer could be mapped by multiple PTEs. In
> practice, I believe this doesn't happen for either Linux or Windows.
>   

I wouldn't think so, but showing my ignorance for a moment, does shadow2 
not provide a mechanism to look up VAs given a GFN?  This lookup could 
be cheap if the structures are built during shadow page table construction.
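
Something like the following is what I have in mind, purely as a sketch.
Nothing here is shadow2 code: struct fb_rmap and the fb_rmap_* helpers
are hypothetical, the 1024-page size is just the PTE count quoted above,
and unmapping is ignored.  The idea is to record the VA whenever a
shadow PTE for a framebuffer GFN is installed, and to give up (so the
caller can report 'all dirty') once a GFN shows up at a second VA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FB_PAGES 1024          /* PTE count quoted in the thread */
#define NO_VA    UINT64_MAX    /* marks "no mapping recorded yet" */

/* Hypothetical reverse map: one recorded VA per framebuffer GFN. */
struct fb_rmap {
    uint64_t base_gfn;
    uint64_t va[FB_PAGES];
    bool     multi_mapped;     /* a GFN was seen at more than one VA */
};

static void fb_rmap_init(struct fb_rmap *r, uint64_t base_gfn)
{
    r->base_gfn = base_gfn;
    r->multi_mapped = false;
    for (size_t i = 0; i < FB_PAGES; i++)
        r->va[i] = NO_VA;
}

/* Imagined hook, run when a shadow PTE mapping 'gfn' at 'va' is built. */
static void fb_rmap_record(struct fb_rmap *r, uint64_t gfn, uint64_t va)
{
    if (gfn < r->base_gfn || gfn - r->base_gfn >= FB_PAGES)
        return;                /* not a framebuffer page */

    uint64_t *slot = &r->va[gfn - r->base_gfn];
    if (*slot == NO_VA)
        *slot = va;
    else if (*slot != va)
        r->multi_mapped = true;   /* single-mapping heuristic violated */
}

/*
 * O(1) lookup.  Returns false if the GFN is unknown or the heuristic
 * was violated, in which case the caller should treat everything as
 * dirty rather than trust the map.
 */
static bool fb_rmap_lookup(const struct fb_rmap *r, uint64_t gfn,
                           uint64_t *va_out)
{
    if (r->multi_mapped)
        return false;
    if (gfn < r->base_gfn || gfn - r->base_gfn >= FB_PAGES)
        return false;

    uint64_t va = r->va[gfn - r->base_gfn];
    if (va == NO_VA)
        return false;

    *va_out = va;
    return true;
}
```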

Sounds like this is a good long-term goal but I think I'll stick with 
the threading as an intermediate goal.

I've got a minor concern that threading isn't going to help us much when 
dom0 is UP since the VGA scanning won't happen while an MMIO/PIO request 
happens.  With an SMP dom0, you could potentially do all the VGA 
scanning on one processor ensuring that qemu-dm wasn't ever "busy" when 
a request occurs.  I'm slightly concerned though that having a thread 
that's as CPU hungry as the VGA scanning may increase context-switches 
during the MMIO/PIO handling which would actually hurt performance.

We'll see soon enough though.

Regards,

Anthony Liguori

> There's always a good fallback of just returning 'all dirty' if the
> heuristic is violated. Would be good to knock this up.
>
> Best,
> Ian
>