qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: Peter Maydell <peter.maydell@linaro.org>, qemu-devel@nongnu.org
Subject: Re: [PULL 06/15] tests/qtest/migration: Don't use -cpu max for aarch64
Date: Wed, 31 Jan 2024 12:04:46 +0800	[thread overview]
Message-ID: <ZbnG3qkMBPdsQxan@x1n> (raw)
In-Reply-To: <87wmrqbjnl.fsf@suse.de>

On Tue, Jan 30, 2024 at 06:23:10PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Jan 30, 2024 at 10:18:07AM +0000, Peter Maydell wrote:
> >> On Mon, 29 Jan 2024 at 23:31, Fabiano Rosas <farosas@suse.de> wrote:
> >> >
> >> > Fabiano Rosas <farosas@suse.de> writes:
> >> >
> >> > > Peter Xu <peterx@redhat.com> writes:
> >> > >
> >> > >> On Fri, Jan 26, 2024 at 11:54:32AM -0300, Fabiano Rosas wrote:
> >> > > The issue that occurs to me now is that 'cpu host' will not work with
> >> > > TCG. We might actually need to go poking /dev/kvm for this to work.
> >> >
> >> > Nevermind this last part. There's not going to be a scenario where we
> >> > build with CONFIG_KVM, but run in an environment that does not support
> >> > KVM.
> >> 
> >> Yes, there is. We'll build with CONFIG_KVM on any aarch64 host,
> >> but that doesn't imply that the user running the build and
> >> test has permissions for /dev/kvm.
> >
> > I'm actually pretty confused on why this would be a problem even for
> > neoverse-n1: can we just try to use KVM, if it fails then use TCG?
> > Something like:
> >
> >   (construct qemu cmdline)
> >   ..
> > #ifdef CONFIG_KVM
> 
> >   "-accel kvm "
> > #endif
> >   "-accel tcg "
> >   ..
> >
> > ?
> > IIUC if we specify two "-accel", we'll try the first, then if failed then
> > the 2nd?
> 
> Aside from '-cpu max', there's no -accel and -cpu combination that works
> on all of:
> 
> x86_64 host - TCG-only
> aarch64 host - KVM & TCG
> aarch64 host with --disable-tcg - KVM-only
> aarch64 host without access to /dev/kvm - TCG-only
> 
> And the cpus are:
> host - KVM-only
> neoverse-n1 - TCG-only
> 
> We'll need something like:
> 
> /* covers aarch64 host with --disable-tcg */
> if (qtest_has_accel("kvm") && !qtest_has_accel("tcg")) {
>    if (open("/dev/kvm", O_RDONLY) < 0) {
>        g_test_skip()
>    } else {
>        "-accel kvm -cpu host"
>    }
> }
> 
> /* covers x86_64 host */
> if (!qtest_has_accel("kvm") && qtest_has_accel("tcg")) {
>    "-accel tcg -cpu neoverse-n1"
> }
> 
> /* covers aarch64 host */
> if (qtest_has_accel("kvm") && qtest_has_accel("tcg")) {
>    if (open("/dev/kvm", O_RDONLY) < 0) {
>       "-accel tcg -cpu neoverse-n1"
>    } else {
>       "-accel kvm -cpu host"
>    }
> }

The open("/dev/kvm") logic more or less duplicates what QEMU already does
when init accelerators:

    if (!qemu_opts_foreach(qemu_find_opts("accel"),
                           do_configure_accelerator, &init_failed, &error_fatal)) {
        if (!init_failed) {
            error_report("no accelerator found");
        }
        exit(1);
    }

If /dev/kvm not accessible I think it'll already fallback to tcg here, as
do_configure_accelerator() for kvm will just silently fail for qtest.  I
hope we can still rely on that for /dev/kvm access issues.

Hmm, I just notice that test_migrate_start() already has this later:

        "-accel kvm%s -accel tcg "

So we're actually good from that regard, AFAIU.

Then did I understand it right that in the failure case KVM is properly
initialized, however it crashed later in neoverse-n1 asking for TCG?  So
the logic in the accel code above didn't really work to do a real fallback?
A backtrace of such crash would help, maybe; I tried to find it in the
pipeline log but I can only see:

  ----------------------------------- stderr -----------------------------------
  Broken pipe
  ../tests/qtest/libqtest.c:195: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)

Or, is there some aarch64 cpu that will have a stable CPU ABI (not like
-max, which is unstable), meanwhile supports both TCG + KVM?

Another thing I noticed that we may need to be caution is that currently
gic is also using max version:

        machine_opts = "gic-version=max";

We may want to choose a sane version too, probably altogether with the
patch?

-- 
Peter Xu



  reply	other threads:[~2024-01-31  4:05 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-26  4:17 [PULL 00/15] Migration 20240126 patches peterx
2024-01-26  4:17 ` [PULL 01/15] userfaultfd: use 1ULL to build ioctl masks peterx
2024-01-26  4:17 ` [PULL 02/15] migration: Plug memory leak on HMP migrate error path peterx
2024-01-26  4:17 ` [PULL 03/15] migration: Make threshold_size an uint64_t peterx
2024-01-26  4:17 ` [PULL 04/15] migration: Drop unnecessary check in ram's pending_exact() peterx
2024-01-26  4:17 ` [PULL 05/15] analyze-migration.py: Remove trick on parsing ramblocks peterx
2024-01-26  4:17 ` [PULL 06/15] tests/qtest/migration: Don't use -cpu max for aarch64 peterx
2024-01-26 14:36   ` Fabiano Rosas
2024-01-26 14:39     ` Peter Maydell
2024-01-26 14:54       ` Fabiano Rosas
2024-01-29  2:51         ` Peter Xu
2024-01-29 12:14           ` Fabiano Rosas
2024-01-29 23:30             ` Fabiano Rosas
2024-01-30 10:18               ` Peter Maydell
2024-01-30 10:48                 ` Peter Xu
2024-01-30 21:23                   ` Fabiano Rosas
2024-01-31  4:04                     ` Peter Xu [this message]
2024-01-31 13:09                       ` Fabiano Rosas
2024-02-01  2:56                         ` Peter Xu
     [not found]                           ` <87y1c4ib03.fsf@suse.de>
2024-02-01 23:50                             ` Peter Xu
2024-02-02 10:51                               ` Peter Maydell
2024-02-05  2:56                                 ` Peter Xu
2024-02-12 18:29                                   ` Peter Maydell
2024-02-05  8:35                                 ` Eric Auger
2024-01-26  4:17 ` [PULL 07/15] ci: Add a migration compatibility test job peterx
2024-01-26  4:17 ` [PULL 08/15] ci: Disable migration compatibility tests for aarch64 peterx
2024-01-26  4:17 ` [PULL 09/15] migration/yank: Use channel features peterx
2024-01-26  4:17 ` [PULL 10/15] migration: Fix use-after-free of migration state object peterx
2024-01-26  4:17 ` [PULL 11/15] migration: Take reference to migration state around bg_migration_vm_start_bh peterx
2024-01-26  4:17 ` [PULL 12/15] migration: Reference migration state around loadvm_postcopy_handle_run_bh peterx
2024-01-26  4:17 ` [PULL 13/15] migration: Add a wrapper to qemu_bh_schedule peterx
2024-01-26  4:17 ` [PULL 14/15] migration: Centralize BH creation and dispatch peterx
2024-01-26  4:17 ` [PULL 15/15] Make 'uri' optional for migrate QAPI peterx
2024-01-26 18:16 ` [PULL 00/15] Migration 20240126 patches Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZbnG3qkMBPdsQxan@x1n \
    --to=peterx@redhat.com \
    --cc=farosas@suse.de \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).