From: Tim Deegan
Subject: Re: [PATCH] x86/hap: fix race condition between ENABLE_LOGDIRTY and track_dirty_vram hypercall
Date: Thu, 6 Dec 2012 09:32:06 +0000
Message-ID: <20121206093206.GA82725@ocelot.phlegethon.org>
References: <50B7087D.20407@jp.fujitsu.com> <20121129154041.GD80627@ocelot.phlegethon.org> <048EAD622912254A9DEA24C1734613C18C87CC821E@FTLPMAILBOX02.citrite.net>
In-Reply-To: <048EAD622912254A9DEA24C1734613C18C87CC821E@FTLPMAILBOX02.citrite.net>
To: Robert Phillips
Cc: Kouya Shimura, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

Hi,

At 12:59 -0500 on 03 Dec (1354539567), Robert Phillips wrote:
> > Robert, in your patch you do wrap this all in the paging_lock, but then
> > unlock to call various enable and disable routines.  Is there a version
> > of this race condition there, where some other CPU might call
> > LOG_DIRTY_ENABLE while you've temporarily dropped the lock?
>
> My proposed patch does not modify the problematic locking code so,
> unfortunately, it preserves the race condition that Kouya Shimura
> has discovered.
>
> I question whether his proposed patch would be suitable for the
> multiple frame buffer situation that my proposed patch addresses.
> It is possible that a guest might be updating its frame buffers when
> live migration starts, and the same race would result.
>
> I think the domain.arch.paging.log_dirty function pointers are problematic.
> They are modified and executed without benefit of locking.
>
> I am uncomfortable with adding another lock.
>
> I will look at updating my patch to avoid the race and will (hopefully)
> avoid adding another lock.

Thanks.  I think the paging_lock can probably cover everything we need
here.  These are toolstack operations and should be fairly rare, and
HAP can do most of its work without the paging_lock.

Also, in the next version can you please update this section:

+int hap_track_dirty_vram(struct domain *d,
+                         unsigned long begin_pfn,
+                         unsigned long nr,
+                         XEN_GUEST_HANDLE_64(uint8) guest_dirty_bitmap)
+{
+    long rc = 0;
+    dv_dirty_vram_t *dirty_vram;
+
+    paging_lock(d);
+    dirty_vram = d->arch.hvm_domain.dirty_vram;
+    if ( nr )
+    {
+        dv_range_t *range = NULL;
+        int size = ( nr + BITS_PER_LONG - 1 ) & ~( BITS_PER_LONG - 1 );
+        uint8_t dirty_bitmap[size];

not to allocate a guest-specified amount of stack memory.  This is one
of the things recently found and fixed in the existing code as XSA-27.

http://xenbits.xen.org/hg/staging/xen-4.1-testing.hg/rev/53ef1f35a0f8

Cheers,

Tim.
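
A minimal sketch of the check-then-act race being discussed, written with
plain pthreads rather than Xen's actual locking.  The function and variable
names are illustrative assumptions, not Xen code; the point is only that
dropping the lock between the check and the enable call reopens the window
that holding the paging_lock throughout would close:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t paging_lock_sim = PTHREAD_MUTEX_INITIALIZER;
static bool log_dirty_enabled;      /* stands in for the domain's log-dirty state */
static bool vram_tracking_enabled;  /* stands in for dirty-VRAM tracking state    */

/* Racy pattern: decide under the lock, act after dropping it.  Another
 * thread (e.g. one enabling full log-dirty mode for migration) can change
 * the state in the unlocked window before the enable work runs. */
static void track_vram_racy(void)
{
    pthread_mutex_lock(&paging_lock_sim);
    bool need_enable = !log_dirty_enabled && !vram_tracking_enabled;
    pthread_mutex_unlock(&paging_lock_sim);

    if ( need_enable )
        vram_tracking_enabled = true;   /* may clobber a mode switch made
                                         * while the lock was dropped */
}

/* Safer pattern: keep the lock across the whole check-and-act sequence. */
static void track_vram_locked(void)
{
    pthread_mutex_lock(&paging_lock_sim);
    if ( !log_dirty_enabled && !vram_tracking_enabled )
        vram_tracking_enabled = true;
    pthread_mutex_unlock(&paging_lock_sim);
}

int main(void)
{
    track_vram_racy();
    track_vram_locked();
    printf("vram tracking: %d\n", vram_tracking_enabled);
    return 0;
}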
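
A minimal sketch of one way to avoid the guest-sized stack array flagged
above: copy the dirty bitmap through a small fixed-size buffer in chunks.
This is not the actual XSA-27 fix; the chunk size and copy_chunk_to_guest()
helper are assumptions standing in for whatever copy-to-guest primitive the
real code would use:

#include <stdint.h>
#include <string.h>

#define DIRTY_CHUNK_BYTES 128   /* arbitrary bounded buffer size (assumption) */

/* Hypothetical helper: copy 'len' bytes into the guest bitmap at 'offset'. */
static int copy_chunk_to_guest(uint8_t *guest_bitmap, size_t offset,
                               const uint8_t *chunk, size_t len)
{
    memcpy(guest_bitmap + offset, chunk, len);
    return 0;
}

/* Walk the bitmap in DIRTY_CHUNK_BYTES pieces: stack usage stays constant
 * no matter how large the guest-supplied bitmap is. */
static int copy_dirty_bitmap(uint8_t *guest_bitmap, const uint8_t *dirty,
                             size_t bitmap_bytes)
{
    uint8_t chunk[DIRTY_CHUNK_BYTES];
    size_t off;

    for ( off = 0; off < bitmap_bytes; off += sizeof(chunk) )
    {
        size_t len = bitmap_bytes - off;

        if ( len > sizeof(chunk) )
            len = sizeof(chunk);
        memcpy(chunk, dirty + off, len);
        if ( copy_chunk_to_guest(guest_bitmap, off, chunk, len) )
            return -1;
    }
    return 0;
}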