Message-ID: <5031F060.1030602@redhat.com>
Date: Mon, 20 Aug 2012 10:08:00 +0200
From: Paolo Bonzini
References: <1345326543-10677-1-git-send-email-pbonzini@redhat.com>
 <50309BEE.3090602@profihost.ag> <5030E5F2.7060903@redhat.com>
 <50313D03.8040101@profihost.ag> <5031E5CC.7090306@redhat.com>
 <5031E86F.2000000@profihost.ag>
In-Reply-To: <5031E86F.2000000@profihost.ag>
Subject: Re: [Qemu-devel] [PATCH RFT 0/3] iscsi: fix NULL dereferences / races between task completion and abort
To: Stefan Priebe - Profihost AG
Cc: qemu-devel@nongnu.org, ronniesahlberg@gmail.com

On 20/08/2012 09:34, Stefan Priebe - Profihost AG wrote:
>>> Booting works fine now. But the VM starts to hang after trying to unmap
>>> large regions. No segfault or so just not reacting anymore.
>>
>> This is expected; unfortunately cancellation right now is a synchronous
>> operation in the block layer. SCSI is the first big user of
>> cancellation, and it would indeed benefit from asynchronous cancellation.
>>
>> Without these three patches, you risk corruption in case the following
>> happens:
>>
>>     qemu                  target
>>     -----------------------------------
>>     send unmap -------->
>>     cancel unmap ------>
>>     send write -------->
>>       <---------------- complete write
>>
>>       <---------------- complete unmap
>>       <---------------- cancellation done (unmap complete)
>
> mhm OK that makes sense. But i cannot even login via SSH

That's because the "big QEMU lock" is held by the thread that called
qemu_aio_cancel, so the rest of the VM cannot make progress until the
cancellation returns.

> and i also see
> no cancellation message in kernel log.

And that's because the UNMAP does ultimately succeed. You'll probably
see soft lockup messages though.

The solution here is to bump the timeout of the UNMAP command (either in
the kernel or in libiscsi; I haven't worked out which one is at fault).

Paolo
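
P.S. For anyone following along, here is a toy model of the ordering in
the diagram above. It is written in Python rather than QEMU's C, and the
Target class and run_scenario function are made up for illustration; they
do not correspond to any QEMU or libiscsi API. It assumes, as in the
diagram, that the target finishes the unmap even though a cancel was
requested.

```python
# Toy model of the unmap/cancel/write ordering; not QEMU or libiscsi code.
# Every name here (Target, run_scenario) is hypothetical.

class Target:
    """A one-block 'disk' that applies commands when they complete."""

    def __init__(self):
        self.block = "old"
        self.queue = []  # commands received but not yet completed

    def submit(self, cmd, data=None):
        self.queue.append((cmd, data))

    def complete(self, cmd):
        """Complete the first pending command of the given kind."""
        for i, (pending, data) in enumerate(self.queue):
            if pending == cmd:
                del self.queue[i]
                # An unmap discards the block; a write stores its data.
                self.block = None if pending == "unmap" else data
                return
        raise ValueError(f"no pending {cmd}")


def run_scenario(wait_for_cancel):
    """Return the block contents after an unmap, a cancel and a write."""
    target = Target()
    target.submit("unmap")
    # The initiator now asks to cancel the unmap. In this model the
    # target completes the unmap anyway.
    if wait_for_cancel:
        # Synchronous cancellation: nothing else is sent until the
        # unmap has really finished.
        target.complete("unmap")
        target.submit("write", "new")
        target.complete("write")
    else:
        # Cancellation treated as finished too early: the write goes
        # out while the unmap is still in flight and completes first.
        target.submit("write", "new")
        target.complete("write")
        target.complete("unmap")
    return target.block
```

Calling run_scenario(False) models the sequence in the diagram: the late
unmap discards the freshly written data. Calling run_scenario(True)
models waiting for the cancellation to finish, which keeps the data but
is also why the guest stalls while a slow UNMAP is outstanding.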