qemu-devel.nongnu.org archive mirror
From: Thomas Huth <thuth@redhat.com>
To: Nicholas Piggin <npiggin@gmail.com>, qemu-devel@nongnu.org
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	"David Hildenbrand" <david@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>
Subject: Re: [PATCH] system/physmem: Fix migration dirty bitmap coherency with TCG memory access
Date: Thu, 22 Feb 2024 21:59:36 +0100
Message-ID: <f9fe86e0-e562-45d5-a4cc-aa0052ef5368@redhat.com>
In-Reply-To: <CZ9I9VE1A542.30BIYSXFQT963@wheely>

On 20/02/2024 02.13, Nicholas Piggin wrote:
> On Tue Feb 20, 2024 at 12:10 AM AEST, Thomas Huth wrote:
>> On 19/02/2024 07.17, Nicholas Piggin wrote:
>>> The fastpath in cpu_physical_memory_sync_dirty_bitmap() to test large
>>> aligned ranges forgot to bring the TCG TLB up to date after clearing
>>> some of the dirty memory bitmap bits. This can result in stores through
>>> the TCG TLB not setting the dirty memory bitmap and ultimately causes
>>> memory corruption / lost updates during migration from a TCG host.
>>>
>>> Fix this by exporting an abstracted function to call when dirty bits
>>> have been cleared.
>>>
>>> Fixes: aa8dc044772 ("migration: synchronize memory bitmap 64bits at a time")
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>
>> Sounds promising! ... but it doesn't seem to fix the migration-test qtest
>> with s390x when it gets enabled again:
> 
> Did it fix kvm-unit-tests for you?

It does, indeed! With your QEMU patch here, your new selftest-migration test
in the k-u-t is working reliably with TCG now. Thus feel free to add:

Tested-by: Thomas Huth <thuth@redhat.com>
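
For anyone following along, the failure mode described in the quoted commit
message can be sketched with a toy model. This is not QEMU code: the class
ToyDirtyTracker, its methods, and the flush_fast_path switch are hypothetical
names made up for illustration, and the "fast path" set is only a rough
stand-in for how a TLB entry lets stores skip the dirty bitmap update once a
page is already marked dirty. Under those assumptions, clearing dirty bits
without also resetting the fast path loses the next store:

```python
PAGES_PER_WORD = 64  # the fast path in the commit message works on large aligned ranges


class ToyDirtyTracker:
    """Toy model (not QEMU code): a dirty bitmap plus a per-page store fast path."""

    def __init__(self, num_pages):
        self.dirty = [False] * num_pages
        # Pages in this set are stored to without touching the bitmap,
        # on the assumption that their dirty bit is already set.
        self.fast_path = set()

    def store(self, page):
        if page in self.fast_path:
            return  # fast path: bitmap is assumed to be up to date
        self.dirty[page] = True  # slow path: mark dirty, then allow the fast path
        self.fast_path.add(page)

    def sync_and_clear(self, start, length, flush_fast_path):
        """Return the dirty pages in [start, start + length) and clear their bits."""
        found = [p for p in range(start, start + length) if self.dirty[p]]
        for p in found:
            self.dirty[p] = False
            if flush_fast_path:
                # The step the toy "fix" adds: force the next store to the slow path.
                self.fast_path.discard(p)
        return found


def lost_updates(flush_fast_path):
    """Store, sync, store again, sync again; return what each sync saw."""
    tracker = ToyDirtyTracker(PAGES_PER_WORD)
    tracker.store(3)
    first = tracker.sync_and_clear(0, PAGES_PER_WORD, flush_fast_path)
    tracker.store(3)  # guest writes the same page again after the sync
    second = tracker.sync_and_clear(0, PAGES_PER_WORD, flush_fast_path)
    return first, second
```

With flush_fast_path=False the second sync in this model misses the second
store to page 3, which corresponds to the "lost updates" in the commit
message; with flush_fast_path=True both syncs report the page.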

>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -3385,15 +3385,6 @@ int main(int argc, char **argv)
>>            return g_test_run();
>>        }
>>
>> -    /*
>> -     * Similar to ppc64, s390x seems to be touchy with TCG, so disable it
>> -     * there until the problems are resolved
>> -     */
>> -    if (g_str_equal(arch, "s390x") && !has_kvm) {
>> -        g_test_message("Skipping test: s390x host with KVM is required");
>> -        return g_test_run();
>> -    }
>> -
>>        tmpfs = g_dir_make_tmp("migration-test-XXXXXX", &err);
>>        if (!tmpfs) {
>>            g_test_message("Can't create temporary directory in %s: %s",
>>
>> I wonder whether there is more stuff like this necessary somewhere?
> 
> Possibly. That's what the commit logs for the TCG disable indicate. I
> have found another dirty bitmap TCG race too. I'll send it out after
> some more testing.
> 
>> Did you try to re-enable tests/qtest/migration-test.c for ppc64 with TCG to
>> see whether that works fine now?
> 
> Hmm, I did try and so far ppc64 is not failing even with upstream QEMU.

Oh, indeed! Actually, now that you mention it, I remember that I already
checked it a couple of weeks ago:

https://lore.kernel.org/qemu-devel/7d4f5624-83d2-4330-9315-b23869529e99@redhat.com/

> I'll try with s390x. Any additional build or runtime options to make it
> break? How long does it take for breakage to be evident?

For me, it normally breaks after running the migration test just a few
times, let's say about one run out of ten.

  Thomas


