From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org, berrange@redhat.com, armbru@redhat.com,
Claudio Fontana <cfontana@suse.de>, Jim Fehlig <jfehlig@suse.com>
Subject: Re: [PATCH v2 11/18] migration/multifd: Add direct-io support
Date: Thu, 30 May 2024 17:35:22 -0400
Message-ID: <ZljxGhSFhMFKt584@x1n>
In-Reply-To: <20240523190548.23977-12-farosas@suse.de>
On Thu, May 23, 2024 at 04:05:41PM -0300, Fabiano Rosas wrote:
> When multifd is used along with mapped-ram, we can take advantage of a
> filesystem that supports the O_DIRECT flag and perform direct I/O in
> the multifd threads. This brings a significant performance improvement
> because direct-io writes bypass the page cache which would otherwise
> be thrashed by the multifd data which is unlikely to be needed again
> in a short period of time.
>
> To be able to use a multifd channel opened with O_DIRECT, we must
> ensure that a certain alignment is used. Filesystems usually require a
> block-size alignment for direct I/O. The way to achieve this is by
> enabling the mapped-ram feature, which already aligns its I/O properly
> (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c).
>
> By setting O_DIRECT on the multifd channels, all writes to the same
> file descriptor need to be aligned as well, even the ones that come
> from outside multifd, such as the QEMUFile I/O from the main migration
> code. This makes it impossible to use the same file descriptor for the
> QEMUFile and for the multifd channels. The various flags and metadata
> written by the main migration code will always be unaligned by virtue
> of their small size. To work around this issue, we'll require a second
> file descriptor to be used exclusively for direct I/O.
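For illustration only (not part of the patch), here is a small standalone
sketch of the three conditions direct I/O typically imposes: buffer address,
length and file offset all aligned to the block size. The helper name and the
4096-byte block size are assumptions; the real requirement depends on the
filesystem and the device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed block size for this sketch; real values vary per filesystem. */
#define SKETCH_DIO_ALIGN 4096u

/*
 * Hypothetical helper: returns true only if a write of 'len' bytes from
 * 'buf' at file offset 'offset' meets all three alignment conditions.
 */
static bool direct_io_aligned(const void *buf, size_t len, uint64_t offset,
                              size_t align)
{
    return ((uintptr_t)buf % align == 0) &&
           (len % align == 0) &&
           (offset % align == 0);
}
```

A page-sized RAM write at a mapped-ram offset passes this check, while an
8-byte flag written by the main migration code does not, which is why the two
kinds of writes cannot share one O_DIRECT descriptor.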
>
> The second file descriptor can be obtained by QEMU by re-opening the
> migration file (already possible), or by being provided by the user or
> management application (support to be added in future patches).
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/file.c      | 31 ++++++++++++++++++++++++++-----
>  migration/file.h      |  1 -
>  migration/migration.c | 23 +++++++++++++++++++++++
>  3 files changed, 49 insertions(+), 6 deletions(-)
>
> diff --git a/migration/file.c b/migration/file.c
> index ba5b5c44ff..ac4d206492 100644
> --- a/migration/file.c
> +++ b/migration/file.c
> @@ -50,12 +50,31 @@ void file_cleanup_outgoing_migration(void)
>      outgoing_args.fname = NULL;
>  }
>
> +static void file_enable_direct_io(int *flags)
> +{
> +#ifdef O_DIRECT
> +    if (migrate_direct_io()) {
> +        *flags |= O_DIRECT;
> +    }
> +#else
> +    /* it should have been rejected when setting the parameter */
> +    g_assert_not_reached();
> +#endif
> +}
> +
>  bool file_send_channel_create(gpointer opaque, Error **errp)
>  {
>      QIOChannelFile *ioc;
>      int flags = O_WRONLY;
>      bool ret = true;
>
> +    /*
> +     * Attempt to enable O_DIRECT for the secondary channels. These
> +     * are used for sending ram pages and writes should be guaranteed
> +     * to be aligned to at least page size.
> +     */
> +    file_enable_direct_io(&flags);
Call this only if enabled? That looks clearer, IMHO:

    if (migrate_direct_io()) {
        file_enable_direct_io(&flags);
    }

Then:

    static void file_enable_direct_io(int *flags)
    {
    #ifdef O_DIRECT
        *flags |= O_DIRECT;
    #else
        /* it should have been rejected when setting the parameter */
        g_assert_not_reached();
    #endif
    }

If you remember, we have similar multifd calls, and I'd prefer that all
multifd functions are invoked only after checking that multifd is enabled.
Same idea here.
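To make the suggestion concrete, here is a minimal standalone sketch of the
caller-gated pattern. It is not QEMU code: migrate_direct_io() is stubbed
with a hypothetical global, send_channel_open_flags() is an invented stand-in
for the flag setup in file_send_channel_create(), and abort() stands in for
g_assert_not_reached().

```c
#define _GNU_SOURCE /* exposes O_DIRECT on Linux/glibc */
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the migration parameter. */
static bool direct_io_enabled;

static bool migrate_direct_io(void)
{
    return direct_io_enabled;
}

/* Unconditional helper: the caller must check migrate_direct_io() first. */
static void file_enable_direct_io(int *flags)
{
#ifdef O_DIRECT
    *flags |= O_DIRECT;
#else
    /* Should have been rejected when the parameter was set. */
    abort();
#endif
}

/* Hypothetical caller showing where the check now lives. */
static int send_channel_open_flags(void)
{
    int flags = O_WRONLY;

    if (migrate_direct_io()) {
        file_enable_direct_io(&flags);
    }
    return flags;
}
```

With the check at the call site, the helper's name says exactly what it does,
and the #else branch can only be reached if parameter validation was skipped.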
> +
>      ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
>      if (!ioc) {
>          ret = false;
> @@ -116,21 +135,23 @@ static gboolean file_accept_incoming_migration(QIOChannel *ioc,
>      return G_SOURCE_REMOVE;
>  }
>
> -void file_create_incoming_channels(QIOChannel *ioc, Error **errp)
> +static void file_create_incoming_channels(QIOChannel *ioc, char *filename,
> +                                          Error **errp)
>  {
> -    int i, fd, channels = 1;
> +    int i, channels = 1;
>      g_autofree QIOChannel **iocs = NULL;
> +    int flags = O_RDONLY;
>
>      if (migrate_multifd()) {
>          channels += migrate_multifd_channels();
> +        file_enable_direct_io(&flags);
Same here.
Other than that looks good.
Thanks,
>      }
>
>      iocs = g_new0(QIOChannel *, channels);
> -    fd = QIO_CHANNEL_FILE(ioc)->fd;
>      iocs[0] = ioc;
>
>      for (i = 1; i < channels; i++) {
> -        QIOChannelFile *fioc = qio_channel_file_new_dupfd(fd, errp);
> +        QIOChannelFile *fioc = qio_channel_file_new_path(filename, flags, 0, errp);
>
>          if (!fioc) {
>              while (i) {
> @@ -170,7 +191,7 @@ void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp)
>          return;
>      }
>
> -    file_create_incoming_channels(QIO_CHANNEL(fioc), errp);
> +    file_create_incoming_channels(QIO_CHANNEL(fioc), filename, errp);
>  }
>
>  int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
> diff --git a/migration/file.h b/migration/file.h
> index 7699c04677..9f71e87f74 100644
> --- a/migration/file.h
> +++ b/migration/file.h
> @@ -20,7 +20,6 @@ void file_start_outgoing_migration(MigrationState *s,
>  int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
>  void file_cleanup_outgoing_migration(void);
>  bool file_send_channel_create(gpointer opaque, Error **errp);
> -void file_create_incoming_channels(QIOChannel *ioc, Error **errp);
>  int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
>                              int niov, RAMBlock *block, Error **errp);
>  int multifd_file_recv_data(MultiFDRecvParams *p, Error **errp);
> diff --git a/migration/migration.c b/migration/migration.c
> index e1b269624c..e03c80b3aa 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -155,6 +155,16 @@ static bool migration_needs_seekable_channel(void)
>      return migrate_mapped_ram();
>  }
>
> +static bool migration_needs_extra_fds(void)
> +{
> +    /*
> +     * When doing direct-io, multifd requires two different,
> +     * non-duplicated file descriptors so we can use one of them for
> +     * unaligned IO.
> +     */
> +    return migrate_multifd() && migrate_direct_io();
> +}
> +
>  static bool transport_supports_seeking(MigrationAddress *addr)
>  {
>      if (addr->transport == MIGRATION_ADDRESS_TYPE_FILE) {
> @@ -164,6 +174,12 @@ static bool transport_supports_seeking(MigrationAddress *addr)
>      return false;
>  }
>
> +static bool transport_supports_extra_fds(MigrationAddress *addr)
> +{
> +    /* file: works because QEMU can open it multiple times */
> +    return addr->transport == MIGRATION_ADDRESS_TYPE_FILE;
> +}
> +
>  static bool
>  migration_channels_and_transport_compatible(MigrationAddress *addr,
>                                              Error **errp)
> @@ -180,6 +196,13 @@ migration_channels_and_transport_compatible(MigrationAddress *addr,
>          return false;
>      }
>
> +    if (migration_needs_extra_fds() &&
> +        !transport_supports_extra_fds(addr)) {
> +        error_setg(errp,
> +                   "Migration requires a transport that allows for extra fds (e.g. file)");
> +        return false;
> +    }
> +
>      return true;
>  }
>
> --
> 2.35.3
>
>
--
Peter Xu