From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:51906) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QoW22-0007Xc-Ih for qemu-devel@nongnu.org; Wed, 03 Aug 2011 03:38:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QoW21-0007CO-BP for qemu-devel@nongnu.org; Wed, 03 Aug 2011 03:38:22 -0400 Received: from isrv.corpit.ru ([86.62.121.231]:55828) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QoW21-0007Bd-37 for qemu-devel@nongnu.org; Wed, 03 Aug 2011 03:38:21 -0400 Message-ID: <4E38FAE3.8070408@msgid.tls.msk.ru> Date: Wed, 03 Aug 2011 11:38:11 +0400 From: Michael Tokarev MIME-Version: 1.0 References: <20110719115511.609C553E5@gandalf.tls.msk.ru> <4E25FBA6.3060803@web.de> <20110720163450.GA17333@amt.cnet> <4E275169.9070402@web.de> In-Reply-To: <4E275169.9070402@web.de> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH] do not call monitor_resume() from migrate_fd_put_buffer() error path List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jan Kiszka Cc: Marcelo Tosatti , qemu-devel@nongnu.org So, can we decide on this somehow? I don't see a code path where we don't call monitor_resume at the end, so the "intermediate" monitor_resume can be dropped. This way we fix real bug. If there will be other problem from that, it can be fixed later - this will mean that code path is found... Should I resend the initial patch again? Thanks, /mjt 21.07.2011 02:06, Jan Kiszka wriote: > On 2011-07-20 18:34, Marcelo Tosatti wrote: >> On Tue, Jul 19, 2011 at 11:48:22PM +0200, Jan Kiszka wrote: >>> On 2011-07-19 13:46, Michael Tokarev wrote: >>>> If we do, it results in double monitor_resume() (second being called >>>> from migrate_fd_cleanup() anyway) and monitor suspend count becoming >>>> negative. >>>> >>>> Cc'ing people from `git blame' list for the lines in question: the >>>> change fixes the problem but I'm not sure what the original intention >>>> of this code was in this place. Unfortunately noone replied to two >>>> my attempts to raise this issue. >>>> >>>> Signed-Off-By: Michael Tokarev >>>> --- >>>> migration.c | 3 --- >>>> 1 files changed, 0 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/migration.c b/migration.c >>>> index af3a1f2..115588c 100644 >>>> --- a/migration.c >>>> +++ b/migration.c >>>> @@ -330,9 +330,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size) >>>> if (ret == -EAGAIN) { >>>> qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s); >>>> } else if (ret < 0) { >>>> - if (s->mon) { >>>> - monitor_resume(s->mon); >>>> - } >>>> s->state = MIG_STATE_ERROR; >>>> notifier_list_notify(&migration_state_notifiers); >>>> } >>> >>> Looks reasonable to me, but Marcelo should comment on this, specifically >>> which scenario once required the resume. >>> >>> Jan >> >> If the monitor was suspended (migrate without -d), then this path must >> resume. Should record that somewhere and check here. > > It's clear that we need to resume. The question is in which case > migrate_fd_cleanup may not be called after an error in > migrate_fd_put_buffer. > > Jan >