From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33427)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZrO7v-0004Fz-Cw for qemu-devel@nongnu.org;
	Wed, 28 Oct 2015 06:38:44 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1ZrO7u-0007yG-8Z for qemu-devel@nongnu.org;
	Wed, 28 Oct 2015 06:38:43 -0400
References: <1445954986-13005-1-git-send-email-den@openvz.org>
	<1445954986-13005-5-git-send-email-den@openvz.org>
	<8737wvb8w0.fsf@neno.neno>
From: "Denis V. Lunev"
Message-ID: <5630A59E.4010906@openvz.org>
Date: Wed, 28 Oct 2015 13:38:22 +0300
MIME-Version: 1.0
In-Reply-To: <8737wvb8w0.fsf@neno.neno>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 4/5] migration: add missed aio_context_acquire into hmp_savevm/hmp_delvm
To: quintela@redhat.com
Cc: Amit Shah, Paolo Bonzini, qemu-devel@nongnu.org, Stefan Hajnoczi, qemu-stable@nongnu.org

On 10/28/2015 01:11 PM, Juan Quintela wrote:
> "Denis V. Lunev" wrote:
>> aio_context should be locked in a similar way to what was done for QMP
>> snapshot creation; otherwise there are a lot of possible
>> troubles if native AIO mode is enabled for the disk.
>>
>> - the command can hang (HMP thread) with missed wakeup (the operation is
>>   actually complete)
>>     io_submit
>>     ioq_submit
>>     laio_submit
>>     raw_aio_submit
>>     raw_aio_readv
>>     bdrv_co_io_em
>>     bdrv_co_readv_em
>>     bdrv_aligned_preadv
>>     bdrv_co_do_preadv
>>     bdrv_co_do_readv
>>     bdrv_co_readv
>>     qcow2_co_readv
>>     bdrv_aligned_preadv
>>     bdrv_co_do_pwritev
>>     bdrv_rw_co_entry
>>
>> - QEMU can assert in coroutine re-enter
>>     __GI_abort
>>     qemu_coroutine_enter
>>     bdrv_co_io_em_complete
>>     qemu_laio_process_completion
>>     qemu_laio_completion_bh
>>     aio_bh_poll
>>     aio_dispatch
>>     aio_poll
>>     iothread_run
>>
>> AioContext lock is recursive. Thus nested locking should not be a problem.
>>
>> Signed-off-by: Denis V. Lunev
>> CC: Stefan Hajnoczi
>> CC: Paolo Bonzini
>> CC: Juan Quintela
>> CC: Amit Shah
>> ---
>>  block/snapshot.c   | 5 +++++
>>  migration/savevm.c | 7 +++++++
>>  2 files changed, 12 insertions(+)
>
>
> Reviewed-by: Juan Quintela
>
> But once there, I can't understand why migration has to know about
> aio_contexts at all.
>
> I *think* that it would be a good idea to "hide" the
> aio_context_acquire(aio_context) inside qemu_fopen_bdrv() (yes, it is
> still in migration/*, but you get the idea). But I don't propose it,
> because we don't have qemu_fclose_bdrv(). Yes, we could add an
> aio_context inside QEMUFile, and release it on qemu_fclose(), but I
> guess this needs more thought yet.
>
> BTW, now that I have your attention, why is this needed in hmp_savevm()
> but not in load_vmstate()? We are also using qemu_fopen_bdrv() there.
> Is it because we are only reading from there? Just curious about the
> reason, or whether we are missing something there.
>
> Thanks, Juan.

I think that the race is still there (I have checked this several
times, but fewer times than create/delete snapshot), but the window
is seriously reduced due to bdrv_drain_all at the beginning.

In general you are right. But in this case we are almost doomed.
Any single read/write operation could be executed in the iothread only.
Maybe I have missed something in this puzzle.

OK. What about a bdrv_fclose callback and a similar (new) callback for
open, which would be called through qemu_fopen_bdrv for the block driver
only?

Den
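
For illustration, the open/close callback idea could look roughly like
the sketch below. It is a self-contained mock, not QEMU code:
MockAioContext, MockFile, MockFileOps and all mock_* functions are
hypothetical names, and the "lock" is only a depth counter standing in
for a recursive AioContext lock.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an AioContext: a recursive lock modelled
 * as a plain depth counter (single-threaded illustration only). */
typedef struct MockAioContext {
    int depth;                      /* how many times the lock is held */
} MockAioContext;

static void mock_ctx_acquire(MockAioContext *ctx) { ctx->depth++; }
static void mock_ctx_release(MockAioContext *ctx) { ctx->depth--; }

/* Hypothetical per-backend callbacks run on open and on close. */
typedef struct MockFileOps {
    void (*open)(void *opaque);
    void (*close)(void *opaque);
} MockFileOps;

/* Hypothetical file object that remembers its backend callbacks. */
typedef struct MockFile {
    const MockFileOps *ops;
    void *opaque;
} MockFile;

/* Generic open: allocate the file and run the backend's open callback. */
static MockFile *mock_fopen_ops(void *opaque, const MockFileOps *ops)
{
    MockFile *f = malloc(sizeof(*f));
    if (!f) {
        return NULL;
    }
    f->ops = ops;
    f->opaque = opaque;
    if (ops->open) {
        ops->open(opaque);
    }
    return f;
}

/* Generic close: run the backend's close callback, then free the file. */
static void mock_fclose(MockFile *f)
{
    if (f->ops->close) {
        f->ops->close(f->opaque);
    }
    free(f);
}

/* Block-driver flavour: take the context on open, drop it on close. */
static void mock_bdrv_open(void *opaque)  { mock_ctx_acquire(opaque); }
static void mock_bdrv_close(void *opaque) { mock_ctx_release(opaque); }

static const MockFileOps mock_bdrv_ops = {
    .open  = mock_bdrv_open,
    .close = mock_bdrv_close,
};

/* Only the block-driver open path installs the locking callbacks. */
static MockFile *mock_fopen_bdrv(MockAioContext *ctx)
{
    return mock_fopen_ops(ctx, &mock_bdrv_ops);
}
```

With this shape, a caller that already holds the context and then calls
mock_fopen_bdrv() simply nests one level deeper, and mock_fclose()
undoes exactly that one level, so the caller never touches the context
on behalf of the file.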