From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: libvirtd live-locking on CTX_LOCK when doing 'virsh save /tmp/blah' with guest corrupting memory (on purpose).
Date: Fri, 10 Apr 2015 17:05:13 +0100
Message-ID: <5527F4B9.5040107@citrix.com>
References: <20150408144730.GA16160@l.oracle.com> <55254D14.1010708@citrix.com> <20150410154451.GA18981@l.oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20150410154451.GA18981@l.oracle.com>
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On 10/04/15 16:44, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 08, 2015 at 04:45:24PM +0100, Andrew Cooper wrote:
>> On 08/04/15 15:47, Konrad Rzeszutek Wilk wrote:
>>> Hey Jim, Andrew, and Ian,
>>>
>>> This is libvirt v1.2.14 + three patches:
>>> c82a59b libxl: drop virDomainObj lock when destroying a domain
>>> a1c9d30 libxl: acquire a job when destroying a domain
>>> 5bd5406 libxl: Move job acquisition in libxlDomainStart to callers
>>>
>>> For fun I've set up an guest with PCI passthrough and tried to save it
>>> (HAHAH) with an disastrous result (xc_save_helper was stuck). Probably
>>> due to outstanding DMA operations wreaking havoc.
>> Outstanding DMA wont make any difference. It isn't (and can't) be
>> reflected in the logdirty bitmap, so libxc simply wont know about it.
>>
>> xc_save_helper is blocked because it has called back into the libxl with
>> the suspend_and_state() callback.
>>
>> i.e. libxc has requested that libxl pause the domain, and that request
>> is still outstanding.
>>
>>
>> The vcpu trace from the very bottom shows that the guest has not yet
>> paused itself. 1 vcpu is blocked in the hypervisor while the other look
>> to be in some spinlock code.
> Except the guest is in '---ss- ' so it _should_ be paused by now.

You cannot trust this line for an HVM guest. It simply means that the
toolstack has performed the remote_shutdown hypercall, not that the
guest has finally stopped.

~Andrew