From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([208.118.235.92]:36088) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TVLvk-0005HG-T6 for qemu-devel@nongnu.org; Mon, 05 Nov 2012 07:37:30 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1TVLvi-0004eb-Vd for qemu-devel@nongnu.org; Mon, 05 Nov 2012 07:37:28 -0500 Received: from david.siemens.de ([192.35.17.14]:18592) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1TVLvi-0004eC-Le for qemu-devel@nongnu.org; Mon, 05 Nov 2012 07:37:26 -0500 Message-ID: <5097B302.3080203@siemens.com> Date: Mon, 05 Nov 2012 13:37:22 +0100 From: Jan Kiszka MIME-Version: 1.0 References: <509627A2.3040509@web.de> <5096C044.7060504@redhat.com> <50975C27.5010906@web.de> <509774ED.6040104@redhat.com> <50977E1A.1070105@siemens.com> <5097B203.30207@redhat.com> In-Reply-To: <5097B203.30207@redhat.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] memory: Reintroduce dirty flag to optimize changes on disabled regions List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Avi Kivity Cc: Blue Swirl , qemu-devel On 2012-11-05 13:33, Avi Kivity wrote: > On 11/05/2012 10:51 AM, Jan Kiszka wrote: >> On 2012-11-05 09:12, Avi Kivity wrote: >>> On 11/05/2012 08:26 AM, Jan Kiszka wrote: >>>> On 2012-11-04 20:21, Avi Kivity wrote: >>>>> On 11/04/2012 10:30 AM, Jan Kiszka wrote: >>>>>> From: Jan Kiszka >>>>>> >>>>>> Cirrus is triggering this, e.g. during Win2k boot: Changes only on >>>>>> disabled regions require no topology update when transaction depth drops >>>>>> to 0 again. >>>>> >>>>> 817dcc5368988b0 (pci: give each device its own address space) mad this >>>>> much worse by multiplying the number of address spaces. Each change is >>>>> now evaluated N+2 times, where N is the number of PCI devices. It also >>>>> causes a corresponding expansion in memory usage. >>>> >>>> I know... But this regression predates your changes, is already visible >>>> right after 02e2b95fb4. >>>> >>>>> >>>>> I want to address this by caching AddressSpaceDispatch trees with the >>>>> key being the contents of the FlatView for that address space. This >>>>> will drop the number of distinct trees to 2-4 (3 if some devices have >>>>> PCI_COMMAND_MASTER disabled, 4 if the PCI address space is different >>>>> from the cpu memory address space) but will fail if we make each address >>>>> space different (for example filtering out the device's own BARs). >>>>> >>>>> If this change also improves cpu usage sufficiently, then it will be >>>>> better than your patch, which doesn't recognize changes in an enabled >>>>> region inside a disabled or hidden region. >>>> >>>> True, though the question is how common such scenarios are. This one >>>> (cirrus with win2k) is already special. >>>> >>>>> In other words, your patch >>>>> fits the problem at hand but isn't general. On the other hand my >>>>> approach doesn't eliminate render_memory_region(), just the exec.c stuff >>>>> and listener updates. So we need to understand where the slowness comes >>>>> from. >>>> >>>> I would just like to have some even intermediate solution for 1.3. We >>>> can still make it more perfect later on if required. >>>> >>> >>> I think we should apply a v2 then, the more general optimizations will >>> take some time. >> >> OK - what should v2 do differently? >> > > As I noted, init and destroy cannot cause a topology update. Ah, right. Why are we wrapping them in transaction_begin/commit at all then? Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux