From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46462) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZwquV-00045G-1x for qemu-devel@nongnu.org; Thu, 12 Nov 2015 07:23:28 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZwquR-0001EO-N6 for qemu-devel@nongnu.org; Thu, 12 Nov 2015 07:23:27 -0500
Received: from mx1.redhat.com ([209.132.183.28]:42620) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZwquR-0001E8-G8 for qemu-devel@nongnu.org; Thu, 12 Nov 2015 07:23:23 -0500
Date: Thu, 12 Nov 2015 12:23:19 +0000
From: "Dr. David Alan Gilbert"
Message-ID: <20151112122318.GF2754@work-vm>
References: <1447165546-27784-1-git-send-email-quintela@redhat.com> <1447165546-27784-43-git-send-email-quintela@redhat.com> <20151112120443.GE2754@work-vm>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Subject: Re: [Qemu-devel] [PULL 42/57] Page request: Consume pages off the post-copy queue
To: Peter Maydell
Cc: Amit Shah , QEMU Developers , Juan Quintela

* Peter Maydell (peter.maydell@linaro.org) wrote:
> On 12 November 2015 at 12:04, Dr. David Alan Gilbert
> wrote:
> > * Peter Maydell (peter.maydell@linaro.org) wrote:
> >> On 10 November 2015 at 14:25, Juan Quintela wrote:
> >> > From: "Dr. David Alan Gilbert"
> >> >
> >> > When transmitting RAM pages, consume pages that have been queued by
> >> > MIG_RPCOMM_REQPAGE commands and send them ahead of normal page scanning.
> >> >
> >> > Note:
> >> >   a) After a queued page the linear walk carries on from after the
> >> >   unqueued page; there is a reasonable chance that the destination
> >> >   was about to ask for other closeby pages anyway.
> >> >
> >> >   b) We have to be careful of any assumptions that the page walking
> >> >   code makes, in particular it does some short cuts on its first linear
> >> >   walk that break as soon as we do a queued page.
> >> >
> >> >   c) We have to be careful to not break up host-page size chunks, since
> >> >   this makes it harder to place the pages on the destination.
> >> >
> >> > Signed-off-by: Dr. David Alan Gilbert
> >> > Reviewed-by: Juan Quintela
> >> > Signed-off-by: Juan Quintela
> >>
> >> I've just discovered that this is causing 'make check' failures on
> >> my OSX host (unfortunately something in my setup is causing
> >> 'make check' failures to not always cause a build failure, so I
> >> didn't notice earlier):
> >
> > It's only failing on OSX? Every time or only sometimes?
>
> Only OSX, and always. I think OSX is pickier about mutexes really
> needing to be initialized before use.

OK, at least an 'always' should be easier to debug.

> > If you can find a way to get a backtrace off that qemu_mutex_lock case
> > that would be great; I'd assume the later errors are the fall out from that.
>
> I'll have a look after lunch, but it's usually painful to get a
> backtrace out of this kind of qtest, because it's clearly starting
> a whole pile of QEMUs and there's no way I know of to say "only
> run a few of these tests, not the whole huge pile".

You could add an abort/assert into util/qemu-thread-posix.c qemu_mutex_lock
in the error path.

Could you also add:

diff --git a/migration/migration.c b/migration/migration.c
index 9bd2ce7..85e5766 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -93,6 +93,7 @@ MigrationState *migrate_get_current(void)
     };
 
     if (!once) {
+        fprintf(stderr,"migrate_get_current do init of current_migration %d\n", getpid());
         qemu_mutex_init(&current_migration.src_page_req_mutex);
         once = true;
     }
diff --git a/migration/ram.c b/migration/ram.c
index 4266687..72b46f2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1036,6 +1036,7 @@ static RAMBlock *unqueue_page(MigrationState *ms, ram_addr_t *offset,
 {
     RAMBlock *block = NULL;
 
+    fprintf(stderr,"unqueue_page %d\n", getpid());
     qemu_mutex_lock(&ms->src_page_req_mutex);
     if (!QSIMPLEQ_EMPTY(&ms->src_page_requests)) {
         struct MigrationSrcPageRequest *entry =

and make sure that the init happens before the first unqueue (you'll get
loads of calls to unqueue).

Dave

>
> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK