From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B1597C74A5B for ; Sun, 26 Mar 2023 16:50:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pgTYg-0003Ia-AQ; Sun, 26 Mar 2023 12:49:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pgTYf-0003IS-4V for qemu-devel@nongnu.org; Sun, 26 Mar 2023 12:49:29 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pgTYc-0005MK-U1 for qemu-devel@nongnu.org; Sun, 26 Mar 2023 12:49:28 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679849365; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=vbKXcmm5r4824ryo074qPzqnawegFkMefVsqGO5Vp1c=; b=Fmtoit/+U8ljzDwMiM58nRnCaoXQaGk38BTFO8vScEbMhDlky4wprAcZ+CTha7yQ5Sht9+ WhouiyY7/BaDSi8tO4nf9umIcsViWg7ig8cL00+OG0BzZN/XL+W8ynOL1rw8N8tSqAKj06 jHn3owxju+Dla3lvdawr2fDujsSBJ4g= Received: from mail-yw1-f197.google.com (mail-yw1-f197.google.com [209.85.128.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-107-xg349MAwP5qwC3CXtlYtmg-1; Sun, 26 Mar 2023 12:49:13 -0400 X-MC-Unique: xg349MAwP5qwC3CXtlYtmg-1 Received: by mail-yw1-f197.google.com with SMTP id 00721157ae682-5425c04765dso65633767b3.0 for ; Sun, 26 Mar 2023 09:49:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679849353; x=1682441353; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=vbKXcmm5r4824ryo074qPzqnawegFkMefVsqGO5Vp1c=; b=1bwqoChoajuqa/9MpGGhEMgmgfgFeuOpgui8AOv69Ge7KlxlWnQkXzYTgliOUzivq+ Me6N/aBWm12DpNR7WoRLb0dVPA9fzsfFd6Fd2sJ9OkBNSgukB/072ikokVVV/RcJ3rM/ WQPDwECNQlR3aAMTwfQkom6KusGWdHgongB12WO96ME+e/o6dpu0D1EGaJrzvXezPcvs o0NCbfNQvoC84IgleVbPwVee8v7Fduuj9JVVNpTDTsYcSab8RXeVcYe7LKmGzWpLWggQ 9WTdB/5fwN3AOD6XaPCK59+HXJ9MjOsSxlF7j6a20C7AesBMqEhVjVNRljrvGYOm7Dqi j4Fg== X-Gm-Message-State: AAQBX9cs8mPJXQ2ipl3KyIciJQZ4CZhmZp3E7t8cBXQ9PuRL+oIk0Ao9 rDSeuTiS2YvFu2GASN26ZA8G2iDyoTR1p024lpBlFK3E3cgf5SvU7I3Oa5b4W5/WszaU46gJuOj oT1JvwumYWtdg2II= X-Received: by 2002:a05:690c:2a09:b0:545:4a47:98d2 with SMTP id ei9-20020a05690c2a0900b005454a4798d2mr8884664ywb.2.1679849352689; Sun, 26 Mar 2023 09:49:12 -0700 (PDT) X-Google-Smtp-Source: AKy350adSKqe7byrWf0HbIRwmHMhwahFlYSsXEBix3KGyrPMV+oYxiIqQOMOh+cc+Uml+BTenJ8chA== X-Received: by 2002:a05:690c:2a09:b0:545:4a47:98d2 with SMTP id ei9-20020a05690c2a0900b005454a4798d2mr8884643ywb.2.1679849352306; Sun, 26 Mar 2023 09:49:12 -0700 (PDT) Received: from x1n (bras-base-aurron9127w-grc-40-70-52-229-124.dsl.bell.ca. [70.52.229.124]) by smtp.gmail.com with ESMTPSA id 135-20020a81078d000000b00545a0818470sm1539797ywh.0.2023.03.26.09.49.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 26 Mar 2023 09:49:11 -0700 (PDT) Date: Sun, 26 Mar 2023 12:49:10 -0400 From: Peter Xu To: "Dr. David Alan Gilbert" Cc: peter.maydell@linaro.org, qemu-devel@nongnu.org, quintela@redhat.com Subject: Re: s390 migration crash Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org On Wed, Mar 22, 2023 at 03:16:23PM -0400, Peter Xu wrote: > On Wed, Mar 22, 2023 at 06:13:43PM +0000, Dr. David Alan Gilbert wrote: > > * Peter Xu (peterx@redhat.com) wrote: > > > On Wed, Mar 22, 2023 at 02:05:06PM +0000, Dr. David Alan Gilbert wrote: > > > > * Peter Xu (peterx@redhat.com) wrote: > > > > > On Tue, Mar 21, 2023 at 08:24:37PM +0000, Dr. David Alan Gilbert wrote: > > > > > > Hi Peter's, > > > > > > Peter M pointed me to a seg in a migration test in CI; I can reproduce > > > > > > it: > > > > > > * On an s390 host > > > > > > > > > > How easy to reproduce? > > > > > > > > Pretty much every time when run as: > > > > make check -j 4 > > > > > > > > > > * only as part of a make check - running migration-test by itself > > > > > > doesn't trigger for me. > > > > > > * It looks like it's postcopy preempt > > > > > > > > > > > > (gdb) bt full > > > > > > #0 iov_size (iov=iov@entry=0x2aa00e60670, iov_cnt=) at ../util/iov.c:88 > > > > > > len = 13517923312037845750 > > > > > > i = 17305 > > > > > > #1 0x000002aa004d068c in qemu_fflush (f=0x2aa00e58630) at ../migration/qemu-file.c:307 > > > > > > local_error = 0x0 > > > > > > #2 0x000002aa004d0e04 in qemu_fflush (f=) at ../migration/qemu-file.c:297 > > > > > > #3 0x000002aa00613962 in postcopy_preempt_shutdown_file (s=s@entry=0x2aa00d1b4e0) at ../migration/ram.c:4657 > > > > > > #4 0x000002aa004e12b4 in migration_completion (s=0x2aa00d1b4e0) at ../migration/migration.c:3469 > > > > > > ret = > > > > > > current_active_state = 5 > > > > > > must_precopy = 0 > > > > > > can_postcopy = 0 > > > > > > in_postcopy = true > > > > > > pending_size = 0 > > > > > > __func__ = "migration_iteration_run" > > > > > > iter_state = > > > > > > s = 0x2aa00d1b4e0 > > > > > > thread = > > > > > > setup_start = > > > > > > thr_error = > > > > > > urgent = > > > > > > #5 migration_iteration_run (s=0x2aa00d1b4e0) at ../migration/migration.c:3882 > > > > > > must_precopy = 0 > > > > > > can_postcopy = 0 > > > > > > in_postcopy = true > > > > > > pending_size = 0 > > > > > > __func__ = "migration_iteration_run" > > > > > > iter_state = > > > > > > s = 0x2aa00d1b4e0 > > > > > > thread = > > > > > > setup_start = > > > > > > thr_error = > > > > > > urgent = > > > > > > #6 migration_thread (opaque=opaque@entry=0x2aa00d1b4e0) at ../migration/migration.c:4124 > > > > > > iter_state = > > > > > > s = 0x2aa00d1b4e0 > > > > > > --Type for more, q to quit, c to continue without paging-- > > > > > > thread = > > > > > > setup_start = > > > > > > thr_error = > > > > > > urgent = > > > > > > #7 0x000002aa00819b8c in qemu_thread_start (args=) at ../util/qemu-thread-posix.c:541 > > > > > > __cancel_buf = > > > > > > {__cancel_jmp_buf = {{__cancel_jmp_buf = {{__gregs = {4396782422080, 4393751543808, 4397299389454, 4396844235904, 2929182727824, 2929182933488, 4396843986792, 4397299389455, 33679382915066768, 33678512846981306}, __fpregs = {4396774031360, 8392704, 2929182933488, 0, 4396782422272, 2929172491858, 4396774031360, 1}}}, __mask_was_saved = 0}}, __pad = {0x3ffb4a77a60, 0x0, 0x0, 0x0}} > > > > > > __cancel_routine = 0x2aa00819bf0 > > > > > > __not_first_call = > > > > > > start_routine = 0x2aa004e08f0 > > > > > > arg = 0x2aa00d1b4e0 > > > > > > r = > > > > > > #8 0x000003ffb7b1e2e6 in start_thread () at /lib64/libc.so.6 > > > > > > #9 0x000003ffb7aafdbe in thread_start () at /lib64/libc.so.6 > > > > > > > > > > > > It looks like it's in the preempt test: > > > > > > > > > > > > (gdb) where > > > > > > #0 0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6 > > > > > > #1 0x000003ffb1750890 in raise () from /lib64/libc.so.6 > > > > > > #2 0x000003ffb172a340 in abort () from /lib64/libc.so.6 > > > > > > #3 0x000002aa0041c130 in qtest_check_status (s=) at ../tests/qtest/libqtest.c:194 > > > > > > #4 0x000003ffb1a3b5de in g_hook_list_invoke () from /lib64/libglib-2.0.so.0 > > > > > > #5 > > > > > > #6 0x000003ffb17a0126 in __pthread_kill_implementation () from /lib64/libc.so.6 > > > > > > #7 0x000003ffb1750890 in raise () from /lib64/libc.so.6 > > > > > > #8 0x000003ffb172a340 in abort () from /lib64/libc.so.6 > > > > > > #9 0x000002aa00420318 in qmp_fd_receive (fd=) at ../tests/qtest/libqmp.c:80 > > > > > > #10 0x000002aa0041d5ee in qtest_qmp_receive_dict (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:713 > > > > > > #11 qtest_qmp_receive (s=0x2aa01eb2700) at ../tests/qtest/libqtest.c:701 > > > > > > #12 qtest_vqmp (s=s@entry=0x2aa01eb2700, fmt=fmt@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }", ap=ap@entry=0x3ffc247cc68) > > > > > > at ../tests/qtest/libqtest.c:765 > > > > > > #13 0x000002aa00413f1e in wait_command (who=who@entry=0x2aa01eb2700, command=command@entry=0x2aa00487100 "{ 'execute': 'query-migrate' }") > > > > > > at ../tests/qtest/migration-helpers.c:73 > > > > > > #14 0x000002aa00414078 in migrate_query (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:139 > > > > > > #15 migrate_query_status (who=who@entry=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:161 > > > > > > #16 0x000002aa00414480 in check_migration_status (ungoals=0x0, goal=0x2aa00495c7e "completed", who=0x2aa01eb2700) at ../tests/qtest/migration-helpers.c:177 > > > > > > #17 wait_for_migration_status (who=0x2aa01eb2700, goal=, ungoals=0x0) at ../tests/qtest/migration-helpers.c:202 > > > > > > #18 0x000002aa0041300e in migrate_postcopy_complete (from=from@entry=0x2aa01eb2700, to=to@entry=0x2aa01eb3000, args=args@entry=0x3ffc247cf48) > > > > > > at ../tests/qtest/migration-test.c:1137 > > > > > > #19 0x000002aa004131a4 in test_postcopy_common (args=0x3ffc247cf48) at ../tests/qtest/migration-test.c:1162 > > > > > > #20 test_postcopy_preempt () at ../tests/qtest/migration-test.c:1178 > > > > > > > > > > > > Looking at the iov and file it's garbage; so it makes me think this is > > > > > > something like a flush on a closed file. > > > > > > > > > > I didn't figure out how that could be closed, but I think there's indeed a > > > > > possible race that the qemufile can be accessed by both the return path > > > > > thread and the migration thread concurrently, while qemufile is not thread > > > > > safe on that. > > > > > > > > > > What postcopy_preempt_shutdown_file() does was: the src uses this EOS to > > > > > kick the dest QEMU preempt thread out of the migration and shut it off. > > > > > After some thought I think this is unnecessary complexity, since postcopy > > > > > should end at the point where dest received all the data, then it sends a > > > > > SHUT to src. So potentially it's not good to have dest relying on anything > > > > > from src to shutdown anything (the preempt thread here) because it's the > > > > > dest qemu that makes the final decision to finish. Ideally the preempt > > > > > thread on dest should be able to shutdown itself. > > > > > > > > > > The trick here is preempt thread will block at read() (aka, recvmsg()) at > > > > > the channel at that time and the only way to kick it out from that is a > > > > > shutdown() on dest. I attached a patch did it. I'm not 100% sure whether > > > > > it'll already resolve our problem but worth trying. This also made me > > > > > notice we forgot to enable SHUTDOWN feature on tls server when I was > > > > > running the patch 1 with qtest, so two patches needed. > > > > > > > > Well, it seems to fix it: > > > > > > > > Tested-by: Dr. David Alan Gilbert > > > > > > > > Will this break interaction with an old qemu that's still waiting for > > > > the EOS? > > > > > > Right.. when someone migrates from a 8.0+ QEMU (assuming this can land in > > > this release) to 7.2 QEMU with postcopy+preempt enabled. > > > > So why not carry on sending the EOS? > > Because it's still racy to send it so potentially such a migration is buggy > and can crash? Or maybe you meant we could serialize the sends (e.g. with > some locks) so we keep the send but avoid the crash (hopefully!). Then the > lock will only be used on old qemu binaries. > > > > > > I still see preempt full support only in 8.0; we did major rework in 8.0 to > > > shift to the return path. So maybe it's worth the risk? I can also add a > > > flag for this but it may add maintainance burden in general OTOH. > > > > If you're saying the pre 8.0 never worked then I guess that's fine; > > I think pre-8.0 works too. The rework shouldn't have changed that fact or > the stream protocol. > > > it's tricky from say a libvirt or higher level tool to know what it can > > use. > > Libvirt hasn't yet support preempt mode, but yeah it'll be a problem too in > the futhre. > > So maybe I introduce a flag, send this EOS conditionally, but ignoring the > possibility of crash? Here's a follow up on this one. I think we need to have this resolved sooner or later, if we want this for 8.0 then we'll need it fast, because at least the current solution will break compatibility. We can also postpone the fix later so no rush for 8.0, but unfortunately when I was testing this patch (with a new internal flag for pre-7.2 as addition) I found that qemu already broke with preempt 8.0 -> 7.2-bin because of the recent rework on channel reorders... that changed when the src/dst will establish the preempt channel to avoid race conditions, however it also means a migration from 8.0 -> 7.2 will hang because 7.2 will try to wait for preempt channel while 8.0 will not establish that, not until switching to postcopy. :( So, here's my take on a whole solution of the issues: we try to not only fixup this rare hang with the additional flag, also use that additional flag to fix the breakage of above. Luckily we're still in 8.0 window so we still have chance to fix that. Let's see whether we can still make it. In all cases, I'll prepare patches today very soon and marking it for 8.0 material. I'll test 8.0 <-> 7.2 migrations too with preempt enabled, then we'll move discussion there. Thanks, -- Peter Xu