From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com>,
	Li Zhijian via <qemu-devel@nongnu.org>,
	Laurent Vivier <lvivier@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH 2/2] [NOT-FOR-MERGE] Add qtest for migration over RDMA
Date: Wed, 19 Feb 2025 09:11:30 -0500
Message-ID: <Z7Xmkq0nTmZ8TRXU@x1.local>
In-Reply-To: <87tt8q8416.fsf@suse.de>

On Wed, Feb 19, 2025 at 10:20:21AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Feb 19, 2025 at 05:33:26AM +0000, Zhijian Li (Fujitsu) wrote:
> >> 
> >> 
> >> On 19/02/2025 06:40, Peter Xu wrote:
> >> > On Tue, Feb 18, 2025 at 06:03:48PM -0300, Fabiano Rosas wrote:
> >> >> Li Zhijian via <qemu-devel@nongnu.org> writes:
> >> >>
> >> >>> This qtest requires an RXE link on the host.
> >> >>>
> >> >>> Here is an example to show how to add this RXE link:
> >> >>> $ ./new-rdma-link.sh
> >> >>> 192.168.22.93
> >> >>>
> >> >>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> >> >>> ---
> >> >>> The RDMA migration was broken again...due to lack of sufficient test/qtest.
> >> >>>
> >> >>> It's ugly to add and execute a script to establish an RDMA link in
> >> >>> the C program. If anyone has a better suggestion, please let me know.
> >> >>>
> >> >>> $ cat ./new-rdma-link.sh
> >> >>> get_ipv4_addr() {
> >> >>>          ip -4 -o addr show dev "$1" |
> >> >>>                  sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p'
> >> >>> }
> >> >>>
> >> >>> has_soft_rdma() {
> >> >>>          rdma link | grep -q " netdev $1[[:blank:]]*\$"
> >> >>> }
> >> >>>
> >> >>> start_soft_rdma() {
> >> >>>          local type
> >> >>>
> >> >>>          modprobe rdma_rxe || return $?
> >> >>>          type=rxe
> >> >>>          (
> >> >>>                  cd /sys/class/net &&
> >> >>>                          for i in *; do
> >> >>>                                  [ -e "$i" ] || continue
> >> >>>                                  [ "$i" = "lo" ] && continue
> >> >>>                                  [ "$(<"$i/addr_len")" = 6 ] || continue
> >> >>>                                  [ "$(<"$i/carrier")" = 1 ] || continue
> >> >>>                                  has_soft_rdma "$i" && break
> >> >>>                                  rdma link add "${i}_$type" type $type netdev "$i" && break
> >> >>>                          done
> >> >>>                  has_soft_rdma "$i" && echo $i
> >> >>>          )
> >> >>>
> >> >>> }
> >> >>>
> >> >>> rxe_link=$(start_soft_rdma)
> >> >>> [[ "$rxe_link" ]] && get_ipv4_addr $rxe_link
> >> >>>
> >> >>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> >> >>> ---
> >> >>>   tests/qtest/migration/new-rdma-link.sh |  34 ++++++++
> >> >>>   tests/qtest/migration/precopy-tests.c  | 103 +++++++++++++++++++++++++
> >> >>>   2 files changed, 137 insertions(+)
> >> >>>   create mode 100644 tests/qtest/migration/new-rdma-link.sh
> >> >>>
> >> >>> diff --git a/tests/qtest/migration/new-rdma-link.sh b/tests/qtest/migration/new-rdma-link.sh
> >> >>> new file mode 100644
> >> >>> index 00000000000..ca20594eaae
> >> >>> --- /dev/null
> >> >>> +++ b/tests/qtest/migration/new-rdma-link.sh
> >> >>> @@ -0,0 +1,34 @@
> >> >>> +#!/bin/bash
> >> >>> +
> >> >>> +# Copied from blktests
> >> >>> +get_ipv4_addr() {
> >> >>> +	ip -4 -o addr show dev "$1" |
> >> >>> +		sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p'
> >> >>> +}
> >> >>> +
> >> >>> +has_soft_rdma() {
> >> >>> +	rdma link | grep -q " netdev $1[[:blank:]]*\$"
> >> >>> +}
> >> >>> +
> >> >>> +start_soft_rdma() {
> >> >>> +	local type
> >> >>> +
> >> >>> +	modprobe rdma_rxe || return $?
> >> >>> +	type=rxe
> >> >>> +	(
> >> >>> +		cd /sys/class/net &&
> >> >>> +			for i in *; do
> >> >>> +				[ -e "$i" ] || continue
> >> >>> +				[ "$i" = "lo" ] && continue
> >> >>> +				[ "$(<"$i/addr_len")" = 6 ] || continue
> >> >>> +				[ "$(<"$i/carrier")" = 1 ] || continue
> >> >>> +				has_soft_rdma "$i" && break
> >> >>> +				rdma link add "${i}_$type" type $type netdev "$i" && break
> >> >>> +			done
> >> >>> +		has_soft_rdma "$i" && echo $i
> >> >>> +	)
> >> >>> +
> >> >>> +}
> >> >>> +
> >> >>> +rxe_link=$(start_soft_rdma)
> >> >>> +[[ "$rxe_link" ]] && get_ipv4_addr $rxe_link
> >> >>> diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
> >> >>> index 162fa695318..d2a1c9c9438 100644
> >> >>> --- a/tests/qtest/migration/precopy-tests.c
> >> >>> +++ b/tests/qtest/migration/precopy-tests.c
> >> >>> @@ -98,6 +98,105 @@ static void test_precopy_unix_dirty_ring(void)
> >> >>>       test_precopy_common(&args);
> >> >>>   }
> >> >>>   
> >> >>> +static int new_rdma_link(char *buffer) {
> >> >>> +    // Copied from blktests
> >> >>> +    const char *script =
> >> >>> +        "#!/bin/bash\n"
> >> >>> +        "\n"
> >> >>> +        "get_ipv4_addr() {\n"
> >> >>> +        "    ip -4 -o addr show dev \"$1\" |\n"
> >> >>> +        "    sed -n 's/.*[[:blank:]]inet[[:blank:]]*\\([^[:blank:]/]*\\).*/\\1/p'\n"
> >> >>> +        "}\n"
> >> >>> +        "\n"
> >> >>> +        "has_soft_rdma() {\n"
> >> >>> +        "    rdma link | grep -q \" netdev $1[[:blank:]]*\\$\"\n"
> >> >>> +        "}\n"
> >> >>> +        "\n"
> >> >>> +        "start_soft_rdma() {\n"
> >> >>> +        "    local type\n"
> >> >>> +        "\n"
> >> >>> +        "    modprobe rdma_rxe || return $?\n"
> >> >>> +        "    type=rxe\n"
> >> >>> +        "    (\n"
> >> >>> +        "        cd /sys/class/net &&\n"
> >> >>> +        "        for i in *; do\n"
> >> >>> +        "            [ -e \"$i\" ] || continue\n"
> >> >>> +        "            [ \"$i\" = \"lo\" ] && continue\n"
> >> >>> +        "            [ \"$(<$i/addr_len)\" = 6 ] || continue\n"
> >> >>> +        "            [ \"$(<$i/carrier)\" = 1 ] || continue\n"
> >> >>> +        "            has_soft_rdma \"$i\" && break\n"
> >> >>> +        "            rdma link add \"${i}_$type\" type $type netdev \"$i\" && break\n"
> >> >>> +        "        done\n"
> >> >>> +        "        has_soft_rdma \"$i\" && echo $i\n"
> >> >>> +        "    )\n"
> >> >>> +        "}\n"
> >> >>> +        "\n"
> >> >>> +        "rxe_link=$(start_soft_rdma)\n"
> >> >>> +        "[[ \"$rxe_link\" ]] && get_ipv4_addr $rxe_link\n";
> >> >>> +
> >> >>> +    char script_filename[] = "/tmp/temp_scriptXXXXXX";
> >> >>> +    int fd = mkstemp(script_filename);
> >> >>> +    if (fd == -1) {
> >> >>> +        perror("Failed to create temporary file");
> >> >>> +        return 1;
> >> >>> +    }
> >> >>> +
> >> >>> +    FILE *fp = fdopen(fd, "w");
> >> >>> +    if (fp == NULL) {
> >> >>> +        perror("Failed to open file stream");
> >> >>> +        close(fd);
> >> >>> +        return 1;
> >> >>> +    }
> >> >>> +    fprintf(fp, "%s", script);
> >> >>> +    fclose(fp);
> >> >>> +
> >> >>> +    if (chmod(script_filename, 0700) == -1) {
> >> >>> +        perror("Failed to set execute permission");
> >> >>> +        return 1;
> >> >>> +    }
> >> >>> +
> >> >>> +    FILE *pipe = popen(script_filename, "r");
> >> >>> +    if (pipe == NULL) {
> >> >>> +        perror("Failed to run script");
> >> >>> +        return 1;
> >> >>> +    }
> >> >>> +
> >> >>> +    int idx = 0;
> >> >>> +    while (fgets(buffer + idx, 128 - idx, pipe) != NULL) {
> >> >>> +        idx += strlen(buffer + idx);
> >> >>> +    }
> >> >>> +    if (idx > 0 && buffer[idx - 1] == '\n')
> >> >>> +        buffer[idx - 1] = 0;
> >> >>> +
> >> >>> +    int status = pclose(pipe);
> >> >>> +    if (status == -1) {
> >> >>> +        perror("Error reported by pclose()");
> >> >>> +    } else if (!WIFEXITED(status)) {
> >> >>> +        printf("Script did not terminate normally\n");
> >> >>> +    }
> >> >>> +
> >> >>> +    remove(script_filename);
> >> > 
> >> > The script can be put separately instead of being hard-coded here, right?
> >> 
> >> 
> >> Sure. If so, I wonder whether the migration-test program can know where this script is.
> >> 
> >> 
> >> > 
> >> >>> +
> >> >>> +    return 0;
> >> >>> +}
> >> >>> +
> >> >>> +static void test_precopy_rdma_plain(void)
> >> >>> +{
> >> >>> +    char buffer[128] = {};
> >> >>> +
> >> >>> +    if (new_rdma_link(buffer))
> >> >>> +        return;
> >> >>> +
> >> >>> +    g_autofree char *uri = g_strdup_printf("rdma:%s:7777", buffer);
> >> >>> +
> >> >>> +    MigrateCommon args = {
> >> >>> +        .listen_uri = uri,
> >> >>> +        .connect_uri = uri,
> >> >>> +    };
> >> >>> +
> >> >>> +    test_precopy_common(&args);
> >> >>> +}
> >> >>> +
> >> >>>   static void test_precopy_tcp_plain(void)
> >> >>>   {
> >> >>>       MigrateCommon args = {
> >> >>> @@ -968,6 +1067,10 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env)
> >> >>>                          test_multifd_tcp_uri_none);
> >> >>>       migration_test_add("/migration/multifd/tcp/plain/cancel",
> >> >>>                          test_multifd_tcp_cancel);
> >> >>> +#ifdef CONFIG_RDMA
> >> >>> +    migration_test_add("/migration/precopy/rdma/plain",
> >> >>> +                       test_precopy_rdma_plain);
> >> >>> +#endif
> >> >>>   }
> >> >>>   
> >> >>>   void migration_test_add_precopy(MigrationTestEnv *env)
> >> >>
> >> >> Thanks, that's definitely better than nothing. I'll experiment with this
> >> >> locally, see if I can at least run it before sending a pull request.
> >> > 
> >> > With your newly added --full, IIUC we can add whatever we want there.
> >> > E.g. we can add --rdma and iff specified, migration-test adds the rdma test.
> >> > 
> >> > Or.. skip the test when the rdma link isn't available.
> >> > 
> >> > If we could separate the script into a file, it'll be better.  We could
> >> > create scripts/migration dir and put all migration scripts over there,
> >> 
> >> Do we have any other existing scripts? I didn't find any in the current QEMU tree.
> >
> > We have a few that I'm aware of:
> >
> >   - analyze-migration.py
> >   - vmstate-static-checker.py
> >   - userfaultfd-wrlat.py
> >
> 
> If it cannot be reached from there for some reason, we could copy it to
> build/tests/qtest/migration during the build. As a last resort I'm fine
> with just having it directly at tests/qtest/migration like this patch
> does.

Yes, if we want the test to be able to trigger the script, we can
put it under tests/qtest/migration/.

> 
> >> 
> >> 
> >> > then
> >> > in the test it tries to detect rdma link and fetch the ip only
> >> 
> >> It should work without root permission if we just *detect* and *fetch ip*.
> >> 
> >> Do you also mean we can split new-rdma-link.sh into two separate scripts?
> >> - add-rdma-link.sh # optional, executed by the user before the test (requires root permission)
> >> - detect-fetch-rdma.sh # executed from migration-test
> >
> > Hmm indeed we still need a script to scan over all the ports..
> >
> > If having --rdma is a good idea, maybe we can further make it a parameter
> > to --rdma?
> >
> >   $ migration-test --rdma $RDMA_IP
> >
> > Or:
> >
> >   $ migration-test --rdma-ip $RDMA_IP
> 
> I think --rdma only makes sense if it's going to do something
> special. The optimal scenario is that it always runs the test when it
> can and sets up/tears down anything it needs.
> 
> If it needs root, I'd prefer the test informs about this and does the
> work itself.
> 
> It would also be good to have the add + detect separate so we have more
> flexibility, maybe we manage to enable this in CI even.
> 
> So:
> 
> ./add.sh
> migration-test
> (runs detect.sh + runs rdma test)
> (leaves stuff behind)
> 
> migration-test
> (skips rdma test with message that it needs root)
> 
> sudo migration-test
> (runs add.sh + detect.sh + runs rdma test)
> (cleans itself up)
> 
> Does that make sense to you? I hope it's not too much work.

Looks good here.  We can also keep all the rdma stuff in one file, taking
parameters.

./rdma-helper.sh setup
./rdma-helper.sh detect-ip
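
A minimal sketch of what such a single helper could look like, assuming
the setup and detect-ip subcommand names above.  The function names
(parse_ipv4, first_rdma_netdev, rdma_setup, rdma_detect_ip) are made up
for illustration, and the interface selection only mirrors the loop in the
posted new-rdma-link.sh; it is not a tested replacement for it.  Only
setup needs root, so detect-ip could be run by migration-test as a normal
user.

```bash
#!/bin/bash
# rdma-helper.sh: hypothetical single-file helper (sketch only).

# Extract the IPv4 address from `ip -4 -o addr show` output on stdin.
# Same sed expression as the posted script.
parse_ipv4() {
    sed -n 's/.*[[:blank:]]inet[[:blank:]]*\([^[:blank:]/]*\).*/\1/p'
}

# Print the netdev of the first link found in `rdma link` output on stdin.
first_rdma_netdev() {
    sed -n 's/.*[[:blank:]]netdev[[:blank:]]\{1,\}\([^[:blank:]]\{1,\}\)[[:blank:]]*$/\1/p' |
        head -n 1
}

# Create an rxe link on the first usable Ethernet device (needs root).
rdma_setup() {
    local path dev

    [ "$(id -u)" -eq 0 ] || { echo "setup needs root" >&2; return 1; }
    modprobe rdma_rxe || return $?
    for path in /sys/class/net/*; do
        [ -e "$path" ] || continue
        dev=${path##*/}
        [ "$dev" = "lo" ] && continue
        [ "$(cat "$path/addr_len" 2>/dev/null)" = 6 ] || continue
        [ "$(cat "$path/carrier" 2>/dev/null)" = 1 ] || continue
        # Reuse an existing link on this device, else try to add one.
        rdma link | grep -q " netdev $dev[[:blank:]]*\$" && return 0
        rdma link add "${dev}_rxe" type rxe netdev "$dev" && return 0
    done
    return 1
}

# Print the IPv4 address behind an existing RDMA link (no root needed).
rdma_detect_ip() {
    local dev

    dev=$(rdma link | first_rdma_netdev)
    [ -n "$dev" ] || return 1
    ip -4 -o addr show dev "$dev" | parse_ipv4
}

rdma_helper() {
    case "$1" in
    setup)     rdma_setup ;;
    detect-ip) rdma_detect_ip ;;
    *)         echo "usage: rdma-helper.sh setup|detect-ip" >&2; return 2 ;;
    esac
}

# Dispatch only when a subcommand was given, so the file can also be sourced.
[ $# -eq 0 ] || rdma_helper "$@"
```

With something like this, an admin would run "sudo ./rdma-helper.sh setup"
once, and the test would only call "./rdma-helper.sh detect-ip" and skip
when it prints nothing.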

> 
> If you'd like to limit the usage of sudo for running the tests, then we
> could indeed add the --rdma option and this would be even more
> strict. The good thing about not having --rdma is that I could call add.sh
> and then run the full make check afterwards, but that's not a huge deal.
> 
> > Then maybe migration-test can directly take that IP and run the tests,
> > assuming the admin set up the rdma link.  Then we keep that one script.
> >
> > Or I assume it's still ok that the test requires root only for --rdma, then
> > invoke the script directly in the test.  If so, we'd better also remove the
> > rdma link after the test finishes, so the test has no side effects (modprobe is
> > probably fine).
> >
> > We can wait and see how far Fabiano went with this, and also his opinion.
> 
> I haven't got the chance to try the script yet. I still need to figure
> out what packages I need from the distro.

For mysterious reasons I seem to have all the libs needed; I probably tried
to build RDMA at some point and did something with it.  So I gave it a quick
shot: I can reproduce Zhijian's failure, and patch 1 fixes it.

-- 
Peter Xu



Thread overview: 16+ messages
2025-02-18  7:43 [PATCH 1/2] migration: Prioritize RDMA in ram_save_target_page() Li Zhijian via
2025-02-18  7:43 ` [PATCH 2/2] [NOT-FOR-MERGE] Add qtest for migration over RDMA Li Zhijian via
2025-02-18 21:03   ` Fabiano Rosas
2025-02-18 22:40     ` Peter Xu
2025-02-19  5:33       ` Zhijian Li (Fujitsu) via
2025-02-19 12:47         ` Peter Xu
2025-02-19 13:20           ` Fabiano Rosas
2025-02-19 14:11             ` Peter Xu [this message]
2025-02-20  9:40               ` Li Zhijian via
2025-02-20 15:55                 ` Peter Xu
2025-02-21  1:32                   ` Zhijian Li (Fujitsu) via
2025-02-18 20:30 ` [PATCH 1/2] migration: Prioritize RDMA in ram_save_target_page() Fabiano Rosas
2025-02-18 22:03   ` Peter Xu
2025-02-19  9:39     ` Zhijian Li (Fujitsu) via
2025-02-19 13:23       ` Peter Xu
2025-02-20  1:21         ` Zhijian Li (Fujitsu) via
