qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: qemu-devel@nongnu.org, armbru@redhat.com,
	Juan Quintela <quintela@redhat.com>, Peter Xu <peterx@redhat.com>,
	Leonardo Bras <leobras@redhat.com>,
	Claudio Fontana <cfontana@suse.de>
Subject: Re: [PATCH v2 22/29] io: Add a pwritev/preadv version that takes a discontiguous iovec
Date: Tue, 24 Oct 2023 09:50:01 +0100	[thread overview]
Message-ID: <ZTeFOUf6UzWUIXoY@redhat.com> (raw)
In-Reply-To: <20231023203608.26370-23-farosas@suse.de>

On Mon, Oct 23, 2023 at 05:36:01PM -0300, Fabiano Rosas wrote:
> For the upcoming support to fixed-ram migration with multifd, we need
> to be able to accept an iovec array with non-contiguous data.
> 
> Add a pwritev and preadv version that splits the array into contiguous
> segments before writing. With that we can have the ram code continue
> to add pages in any order and the multifd code continue to send large
> arrays for reading and writing.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> Since iovs can be non contiguous, we'd need a separate array on the
> side to carry an extra file offset for each of them, so I'm relying on
> the fact that iovs are all within a same host page and passing in an
> encoded offset that takes the host page into account.
> ---
>  include/io/channel.h | 50 +++++++++++++++++++++++++++
>  io/channel.c         | 82 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 132 insertions(+)
> 
> diff --git a/include/io/channel.h b/include/io/channel.h
> index a8181d576a..51a99fb9f6 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -33,8 +33,10 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass,
>  #define QIO_CHANNEL_ERR_BLOCK -2
>  
>  #define QIO_CHANNEL_WRITE_FLAG_ZERO_COPY 0x1
> +#define QIO_CHANNEL_WRITE_FLAG_WITH_OFFSET 0x2
>  
>  #define QIO_CHANNEL_READ_FLAG_MSG_PEEK 0x1
> +#define QIO_CHANNEL_READ_FLAG_WITH_OFFSET 0x2
>  
>  typedef enum QIOChannelFeature QIOChannelFeature;
>  
> @@ -559,6 +561,30 @@ int qio_channel_close(QIOChannel *ioc,
>  ssize_t qio_channel_pwritev_full(QIOChannel *ioc, const struct iovec *iov,
>                                   size_t niov, off_t offset, Error **errp);
>  
> +/**
> + * qio_channel_write_full_all:
> + * @ioc: the channel object
> + * @iov: the array of memory regions to write data from
> + * @niov: the length of the @iov array
> + * @offset: the iovec offset in the file where to write the data
> + * @fds: an array of file handles to send
> + * @nfds: number of file handles in @fds
> + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*)
> + * @errp: pointer to a NULL-initialized error object
> + *
> + *
> + * Selects between a writev or pwritev channel writer function.
> + *
> + * If QIO_CHANNEL_WRITE_FLAG_OFFSET is passed in flags, pwritev is
> + * used and @offset is expected to be a meaningful value, @fds and
> + * @nfds are ignored; otherwise uses writev and @offset is ignored.
> + *
> + * Returns: 0 if all bytes were written, or -1 on error
> + */
> +int qio_channel_write_full_all(QIOChannel *ioc, const struct iovec *iov,
> +                               size_t niov, off_t offset, int *fds, size_t nfds,
> +                               int flags, Error **errp);
> +
>  /**
>   * qio_channel_pwritev
>   * @ioc: the channel object
> @@ -595,6 +621,30 @@ ssize_t qio_channel_pwritev(QIOChannel *ioc, char *buf, size_t buflen,
>  ssize_t qio_channel_preadv_full(QIOChannel *ioc, const struct iovec *iov,
>                                  size_t niov, off_t offset, Error **errp);
>  
> +/**
> + * qio_channel_read_full_all:
> + * @ioc: the channel object
> + * @iov: the array of memory regions to read data to
> + * @niov: the length of the @iov array
> + * @offset: the iovec offset in the file from where to read the data
> + * @fds: an array of file handles to send
> + * @nfds: number of file handles in @fds
> + * @flags: read flags (QIO_CHANNEL_READ_FLAG_*)
> + * @errp: pointer to a NULL-initialized error object
> + *
> + *
> + * Selects between a readv or preadv channel reader function.
> + *
> + * If QIO_CHANNEL_READ_FLAG_OFFSET is passed in flags, preadv is
> + * used and @offset is expected to be a meaningful value, @fds and
> + * @nfds are ignored; otherwise uses readv and @offset is ignored.
> + *
> + * Returns: 0 if all bytes were read, or -1 on error
> + */
> +int qio_channel_read_full_all(QIOChannel *ioc, const struct iovec *iov,
> +                              size_t niov, off_t offset,
> +                              int flags, Error **errp);
> +
>  /**
>   * qio_channel_preadv
>   * @ioc: the channel object
> diff --git a/io/channel.c b/io/channel.c
> index 770d61ea00..648b68451d 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -472,6 +472,76 @@ ssize_t qio_channel_pwritev_full(QIOChannel *ioc, const struct iovec *iov,
>      return klass->io_pwritev(ioc, iov, niov, offset, errp);
>  }
>  
> +static int qio_channel_preadv_pwritev_contiguous(QIOChannel *ioc,
> +                                                 const struct iovec *iov,
> +                                                 size_t niov, off_t offset,
> +                                                 bool is_write, Error **errp)
> +{
> +    ssize_t ret;
> +    int i, slice_idx, slice_num;
> +    uint64_t base, next, file_offset;
> +    size_t len;
> +
> +    slice_idx = 0;
> +    slice_num = 1;
> +
> +    /*
> +     * If the iov array doesn't have contiguous elements, we need to
> +     * split it in slices because we only have one (file) 'offset' for
> +     * the whole iov. Do this here so callers don't need to break the
> +     * iov array themselves.
> +     */
> +    for (i = 0; i < niov; i++, slice_num++) {
> +        base = (uint64_t) iov[i].iov_base;
> +
> +        if (i != niov - 1) {
> +            len = iov[i].iov_len;
> +            next = (uint64_t) iov[i + 1].iov_base;
> +
> +            if (base + len == next) {
> +                continue;
> +            }
> +        }
> +
> +        /*
> +         * Use the offset of the first element of the segment that
> +         * we're sending.
> +         */
> +        file_offset = offset + (uint64_t) iov[slice_idx].iov_base;
> +
> +        if (is_write) {
> +            ret = qio_channel_pwritev_full(ioc, &iov[slice_idx], slice_num,
> +                                           file_offset, errp);
> +        } else {
> +            ret = qio_channel_preadv_full(ioc, &iov[slice_idx], slice_num,
> +                                          file_offset, errp);
> +        }
> +
> +        if (ret < 0) {
> +            break;
> +        }
> +
> +        slice_idx += slice_num;
> +        slice_num = 0;
> +    }
> +
> +    return (ret < 0) ? -1 : 0;
> +}
> +
> +int qio_channel_write_full_all(QIOChannel *ioc,
> +                                const struct iovec *iov,
> +                                size_t niov, off_t offset,
> +                                int *fds, size_t nfds,
> +                                int flags, Error **errp)
> +{
> +    if (flags & QIO_CHANNEL_WRITE_FLAG_WITH_OFFSET) {
> +        return qio_channel_preadv_pwritev_contiguous(ioc, iov, niov,
> +                                                     offset, true, errp);
> +    }
> +
> +    return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, flags, errp);
> +}

I don't much like this as a design, as it is two completely different
APIs shoved into one facade that is easy to misunderstand and misuse.
fds, nfds, and other flags values are all silently ignored in the first
branch. offset is silently ignored in the second branch. In fact there's
no functional benefit to the second branch at all, over calling the
existing apis.

I think that there should be qio_channel_{pwritev,preadv}_all
methods that take the 'flags' parameter.

> +
>  ssize_t qio_channel_pwritev(QIOChannel *ioc, char *buf, size_t buflen,
>                              off_t offset, Error **errp)
>  {
> @@ -501,6 +571,18 @@ ssize_t qio_channel_preadv_full(QIOChannel *ioc, const struct iovec *iov,
>      return klass->io_preadv(ioc, iov, niov, offset, errp);
>  }
>  
> +int qio_channel_read_full_all(QIOChannel *ioc, const struct iovec *iov,
> +                              size_t niov, off_t offset,
> +                              int flags, Error **errp)
> +{
> +    if (flags & QIO_CHANNEL_READ_FLAG_WITH_OFFSET) {
> +        return qio_channel_preadv_pwritev_contiguous(ioc, iov, niov,
> +                                                     offset, false, errp);
> +    }
> +
> +    return qio_channel_readv_full_all(ioc, iov, niov, NULL, NULL, errp);
> +}
> +
>  ssize_t qio_channel_preadv(QIOChannel *ioc, char *buf, size_t buflen,
>                             off_t offset, Error **errp)
>  {
> -- 
> 2.35.3
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



  reply	other threads:[~2023-10-24  8:51 UTC|newest]

Thread overview: 128+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-23 20:35 [PATCH v2 00/29] migration: File based migration with multifd and fixed-ram Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 01/29] tests/qtest: migration events Fabiano Rosas
2023-10-25  9:44   ` Thomas Huth
2023-10-25 10:14   ` Daniel P. Berrangé
2023-10-25 13:21   ` Fabiano Rosas
2023-10-25 13:33     ` Steven Sistare
2023-10-23 20:35 ` [PATCH v2 02/29] tests/qtest: Move QTestMigrationState to libqtest Fabiano Rosas
2023-10-25 10:17   ` Daniel P. Berrangé
2023-10-25 13:19     ` Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 03/29] tests/qtest: Allow waiting for migration events Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 04/29] migration: Return the saved state from global_state_store Fabiano Rosas
2023-10-25 10:19   ` Daniel P. Berrangé
2023-10-23 20:35 ` [PATCH v2 05/29] migration: Introduce global_state_store_once Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 06/29] migration: Add auto-pause capability Fabiano Rosas
2023-10-24  5:25   ` Markus Armbruster
2023-10-24 18:12     ` Fabiano Rosas
2023-10-25  5:33       ` Markus Armbruster
2023-10-25  8:48   ` Daniel P. Berrangé
2023-10-25 13:57     ` Fabiano Rosas
2023-10-25 14:20       ` Daniel P. Berrangé
2023-10-25 14:58         ` Peter Xu
2023-10-25 15:25           ` Daniel P. Berrangé
2023-10-25 15:36             ` Peter Xu
2023-10-25 15:40               ` Daniel P. Berrangé
2023-10-25 17:20                 ` Peter Xu
2023-10-25 17:31                   ` Daniel P. Berrangé
2023-10-25 19:28                     ` Peter Xu
2023-10-23 20:35 ` [PATCH v2 07/29] migration: Run "file:" migration with a stopped VM Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 08/29] tests/qtest: File migration auto-pause tests Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 09/29] io: add and implement QIO_CHANNEL_FEATURE_SEEKABLE for channel file Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 10/29] io: Add generic pwritev/preadv interface Fabiano Rosas
2023-10-24  8:18   ` Daniel P. Berrangé
2023-10-24 19:06     ` Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 11/29] io: implement io_pwritev/preadv for QIOChannelFile Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 12/29] migration/qemu-file: add utility methods for working with seekable channels Fabiano Rosas
2023-10-25 10:22   ` Daniel P. Berrangé
2023-10-23 20:35 ` [PATCH v2 13/29] migration: fixed-ram: Add URI compatibility check Fabiano Rosas
2023-10-25 10:27   ` Daniel P. Berrangé
2023-10-31 16:06   ` Peter Xu
2023-10-23 20:35 ` [PATCH v2 14/29] migration/ram: Introduce 'fixed-ram' migration capability Fabiano Rosas
2023-10-24  5:33   ` Markus Armbruster
2023-10-24 18:35     ` Fabiano Rosas
2023-10-25  6:18       ` Markus Armbruster
2023-10-23 20:35 ` [PATCH v2 15/29] migration/ram: Add support for 'fixed-ram' outgoing migration Fabiano Rosas
2023-10-25  9:39   ` Daniel P. Berrangé
2023-10-25 14:03     ` Fabiano Rosas
2023-11-01 15:23     ` Peter Xu
2023-11-01 15:52       ` Daniel P. Berrangé
2023-11-01 16:24         ` Peter Xu
2023-11-01 16:37           ` Daniel P. Berrangé
2023-11-01 17:30             ` Peter Xu
2023-10-31 16:52   ` Peter Xu
2023-10-31 17:33     ` Fabiano Rosas
2023-10-31 17:59       ` Peter Xu
2023-10-23 20:35 ` [PATCH v2 16/29] migration/ram: Add support for 'fixed-ram' migration restore Fabiano Rosas
2023-10-25  9:43   ` Daniel P. Berrangé
2023-10-25 14:07     ` Fabiano Rosas
2023-10-31 19:03       ` Peter Xu
2023-11-01  9:26         ` Daniel P. Berrangé
2023-11-01 14:21           ` Peter Xu
2023-11-01 14:28             ` Daniel P. Berrangé
2023-11-01 15:00               ` Peter Xu
2023-11-06 13:18                 ` Fabiano Rosas
2023-11-06 21:00                   ` Peter Xu
2023-11-07  9:02                     ` Daniel P. Berrangé
2023-10-31 19:09   ` Peter Xu
2023-10-31 20:00     ` Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 17/29] tests/qtest: migration-test: Add tests for fixed-ram file-based migration Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 18/29] migration/multifd: Allow multifd without packets Fabiano Rosas
2023-10-23 20:35 ` [PATCH v2 19/29] migration/multifd: Add outgoing QIOChannelFile support Fabiano Rosas
2023-10-25  9:52   ` Daniel P. Berrangé
2023-10-25 14:12     ` Fabiano Rosas
2023-10-25 14:23       ` Daniel P. Berrangé
2023-10-25 15:00         ` Fabiano Rosas
2023-10-25 15:26           ` Daniel P. Berrangé
2023-10-31 20:11   ` Peter Xu
2023-10-23 20:35 ` [PATCH v2 20/29] migration/multifd: Add incoming " Fabiano Rosas
2023-10-25 10:29   ` Daniel P. Berrangé
2023-10-25 14:18     ` Fabiano Rosas
2023-10-31 21:28   ` Peter Xu
2023-10-23 20:36 ` [PATCH v2 21/29] migration/multifd: Add pages to the receiving side Fabiano Rosas
2023-10-31 22:10   ` Peter Xu
2023-10-31 23:18     ` Fabiano Rosas
2023-11-01 15:55       ` Peter Xu
2023-11-01 17:20         ` Fabiano Rosas
2023-11-01 17:35           ` Peter Xu
2023-11-01 18:14             ` Fabiano Rosas
2023-10-23 20:36 ` [PATCH v2 22/29] io: Add a pwritev/preadv version that takes a discontiguous iovec Fabiano Rosas
2023-10-24  8:50   ` Daniel P. Berrangé [this message]
2023-10-23 20:36 ` [PATCH v2 23/29] migration/ram: Add a wrapper for fixed-ram shadow bitmap Fabiano Rosas
2023-11-01 14:29   ` Peter Xu
2023-10-23 20:36 ` [PATCH v2 24/29] migration/ram: Ignore multifd flush when doing fixed-ram migration Fabiano Rosas
2023-10-25  9:09   ` Daniel P. Berrangé
2023-10-23 20:36 ` [PATCH v2 25/29] migration/multifd: Support outgoing fixed-ram stream format Fabiano Rosas
2023-10-25  9:23   ` Daniel P. Berrangé
2023-10-25 14:21     ` Fabiano Rosas
2023-10-23 20:36 ` [PATCH v2 26/29] migration/multifd: Support incoming " Fabiano Rosas
2023-10-23 20:36 ` [PATCH v2 27/29] tests/qtest: Add a multifd + fixed-ram migration test Fabiano Rosas
2023-10-23 20:36 ` [PATCH v2 28/29] migration: Add direct-io parameter Fabiano Rosas
2023-10-24  5:41   ` Markus Armbruster
2023-10-24 19:32     ` Fabiano Rosas
2023-10-25  6:23       ` Markus Armbruster
2023-10-25  8:44       ` Daniel P. Berrangé
2023-10-25 14:32         ` Fabiano Rosas
2023-10-25 14:43           ` Daniel P. Berrangé
2023-10-25 17:30             ` Fabiano Rosas
2023-10-25 17:45               ` Daniel P. Berrangé
2023-10-25 18:10                 ` Fabiano Rosas
2023-10-30 22:51             ` Fabiano Rosas
2023-10-31  9:03               ` Daniel P. Berrangé
2023-10-31 13:05                 ` Fabiano Rosas
2023-10-31 13:45                   ` Daniel P. Berrangé
2023-10-31 14:33                     ` Fabiano Rosas
2023-10-31 15:22                       ` Daniel P. Berrangé
2023-10-31 15:52                         ` Fabiano Rosas
2023-10-31 15:58                           ` Daniel P. Berrangé
2023-10-31 19:05                             ` Fabiano Rosas
2023-11-01  9:30                               ` Daniel P. Berrangé
2023-11-01 12:16                                 ` Fabiano Rosas
2023-11-01 12:23                                   ` Daniel P. Berrangé
2023-11-01 12:30                                     ` Fabiano Rosas
2023-10-24  8:33   ` Daniel P. Berrangé
2023-10-24 19:06     ` Fabiano Rosas
2023-10-25  9:07   ` Daniel P. Berrangé
2023-10-25 14:48     ` Fabiano Rosas
2023-10-25 15:22       ` Daniel P. Berrangé
2023-10-23 20:36 ` [PATCH v2 29/29] tests/qtest: Add a test for migration with direct-io and multifd Fabiano Rosas
2023-10-25  9:25   ` Daniel P. Berrangé

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZTeFOUf6UzWUIXoY@redhat.com \
    --to=berrange@redhat.com \
    --cc=armbru@redhat.com \
    --cc=cfontana@suse.de \
    --cc=farosas@suse.de \
    --cc=leobras@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).