From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Leonardo Bras <leobras@redhat.com>
Cc: qemu-devel@nongnu.org, Markus Armbruster <armbru@redhat.com>,
Eric Blake <eblake@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH v5 1/6] QIOChannel: Add io_writev_zerocopy & io_flush_zerocopy callbacks
Date: Fri, 12 Nov 2021 10:13:01 +0000
Message-ID: <YY4+LWnRTV7iaErs@redhat.com>
In-Reply-To: <20211112051040.923746-2-leobras@redhat.com>
On Fri, Nov 12, 2021 at 02:10:36AM -0300, Leonardo Bras wrote:
> Adds io_writev_zerocopy and io_flush_zerocopy as optional callback to QIOChannelClass,
> allowing the implementation of zerocopy writes by subclasses.
>
> How to use them:
> - Write data using qio_channel_writev_zerocopy(),
> - Wait write completion with qio_channel_flush_zerocopy().
>
> Notes:
> As some zerocopy implementations work asynchronously, it's
> recommended to keep the write buffer untouched until the return of
> qio_channel_flush_zerocopy(), to avoid the risk of sending an updated
> buffer instead of the one at the write.
>
> As the new callbacks are optional, if a subclass does not implement them, then:
> - io_writev_zerocopy will return -1,
> - io_flush_zerocopy will return 0 without changing anything.
>
> Also, some functions like qio_channel_writev_full_all() were adapted to
> receive a flag parameter. That allows shared code between zerocopy and
> non-zerocopy writev.
>
> Signed-off-by: Leonardo Bras <leobras@redhat.com>
> ---
> include/io/channel.h | 93 ++++++++++++++++++++++++++++++++++++++------
> io/channel.c | 65 +++++++++++++++++++++++++------
> 2 files changed, 135 insertions(+), 23 deletions(-)
>
> diff --git a/include/io/channel.h b/include/io/channel.h
> index 88988979f8..a19c09bb84 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -32,12 +32,15 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass,
>
> #define QIO_CHANNEL_ERR_BLOCK -2
>
> +#define QIO_CHANNEL_WRITE_FLAG_ZEROCOPY 0x1
> +
> typedef enum QIOChannelFeature QIOChannelFeature;
>
> enum QIOChannelFeature {
> QIO_CHANNEL_FEATURE_FD_PASS,
> QIO_CHANNEL_FEATURE_SHUTDOWN,
> QIO_CHANNEL_FEATURE_LISTEN,
> + QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY,
> };
>
>
> @@ -136,6 +139,12 @@ struct QIOChannelClass {
> IOHandler *io_read,
> IOHandler *io_write,
> void *opaque);
> + ssize_t (*io_writev_zerocopy)(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + Error **errp);
> + int (*io_flush_zerocopy)(QIOChannel *ioc,
> + Error **errp);
> };
>
> /* General I/O handling functions */
> @@ -321,10 +330,11 @@ int qio_channel_readv_all(QIOChannel *ioc,
>
>
> /**
> - * qio_channel_writev_all:
> + * qio_channel_writev_all_flags:
> * @ioc: the channel object
> * @iov: the array of memory regions to write data from
> * @niov: the length of the @iov array
> + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*)
> * @errp: pointer to a NULL-initialized error object
> *
> * Write data to the IO channel, reading it from the
> @@ -337,12 +347,23 @@ int qio_channel_readv_all(QIOChannel *ioc,
> * to be written, yielding from the current coroutine
> * if required.
> *
> + * If QIO_CHANNEL_WRITE_FLAG_ZEROCOPY is passed in flags,
> + * instead of waiting for all requested data to be written,
> + * this function will wait until it's all queued for writing.
> + * In this case, if the buffer gets changed between queueing and
> + * sending, the updated buffer will be sent. If this is not a
> + * desired behavior, it's suggested to call qio_channel_flush_zerocopy()
> + * before reusing the buffer.
> + *
> * Returns: 0 if all bytes were written, or -1 on error
> */
> -int qio_channel_writev_all(QIOChannel *ioc,
> - const struct iovec *iov,
> - size_t niov,
> - Error **erp);
> +int qio_channel_writev_all_flags(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + int flags,
> + Error **errp);
> +#define qio_channel_writev_all(ioc, iov, niov, errp) \
> + qio_channel_writev_all_flags(ioc, iov, niov, 0, errp)

Zerocopy already has its own dedicated methods rather than flags,
so we shouldn't add flags here either.

Add a qio_channel_writev_zerocopy_all method instead.

Internally, qio_channel_writev_zerocopy_all and qio_channel_writev_all
can still share the same helper method; just don't expose flags in
the public API. Even internally we don't really need flags, just a
bool.

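
A minimal sketch of that layering, for illustration only. It assumes
stand-in types: DemoChannel and every demo_* function below are
hypothetical and are not QEMU code. The point is the shape: two public
entry points without a flags argument, both delegating to one internal
helper that takes a bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical stand-in for QIOChannel: it only counts bytes per path. */
typedef struct DemoChannel {
    size_t copied_bytes;    /* bytes sent through the ordinary path */
    size_t zerocopy_bytes;  /* bytes queued through the zerocopy path */
    size_t max_per_call;    /* cap per call, to simulate short writes */
} DemoChannel;

/* Mock single write: accepts at most max_per_call bytes. */
static ssize_t demo_write_once(DemoChannel *ch, const struct iovec *iov,
                               size_t niov, bool zerocopy)
{
    size_t total = 0;

    for (size_t i = 0; i < niov; i++) {
        total += iov[i].iov_len;
    }
    if (total > ch->max_per_call) {
        total = ch->max_per_call;
    }
    if (zerocopy) {
        ch->zerocopy_bytes += total;
    } else {
        ch->copied_bytes += total;
    }
    return (ssize_t)total;
}

/* Internal helper shared by both entry points; takes a bool, not flags. */
static int demo_writev_all_common(DemoChannel *ch, const struct iovec *iov,
                                  size_t niov, bool zerocopy)
{
    struct iovec *local;
    size_t idx = 0;

    assert(ch->max_per_call > 0);
    if (niov == 0) {
        return 0;
    }
    local = malloc(niov * sizeof(*local));
    if (!local) {
        return -1;
    }
    memcpy(local, iov, niov * sizeof(*local));

    while (idx < niov) {
        ssize_t ret = demo_write_once(ch, &local[idx], niov - idx, zerocopy);
        size_t len;

        if (ret < 0) {
            free(local);
            return -1;
        }
        len = (size_t)ret;
        /* Drop fully consumed entries, then trim the partial one. */
        while (idx < niov && local[idx].iov_len <= len) {
            len -= local[idx].iov_len;
            idx++;
        }
        if (idx < niov) {
            local[idx].iov_base = (char *)local[idx].iov_base + len;
            local[idx].iov_len -= len;
        }
    }
    free(local);
    return 0;
}

/* Public API: no flags parameter on either function. */
static int demo_writev_all(DemoChannel *ch, const struct iovec *iov,
                           size_t niov)
{
    return demo_writev_all_common(ch, iov, niov, false);
}

static int demo_writev_zerocopy_all(DemoChannel *ch, const struct iovec *iov,
                                    size_t niov)
{
    return demo_writev_all_common(ch, iov, niov, true);
}
```

With this split, existing callers of the non-zerocopy function keep
their signature unchanged, and no macro wrapper is needed.
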
>
> /**
> * qio_channel_readv:
> @@ -831,12 +852,13 @@ int qio_channel_readv_full_all(QIOChannel *ioc,
> Error **errp);
>
> /**
> - * qio_channel_writev_full_all:
> + * qio_channel_writev_full_all_flags:
> * @ioc: the channel object
> * @iov: the array of memory regions to write data from
> * @niov: the length of the @iov array
> * @fds: an array of file handles to send
> * @nfds: number of file handles in @fds
> + * @flags: write flags (QIO_CHANNEL_WRITE_FLAG_*)
> * @errp: pointer to a NULL-initialized error object
> *
> *
> @@ -846,13 +868,62 @@ int qio_channel_readv_full_all(QIOChannel *ioc,
> * to be written, yielding from the current coroutine
> * if required.
> *
> + * If QIO_CHANNEL_WRITE_FLAG_ZEROCOPY is passed in flags,
> + * instead of waiting for all requested data to be written,
> + * this function will wait until it's all queued for writing.
> + * In this case, if the buffer gets changed between queueing and
> + * sending, the updated buffer will be sent. If this is not a
> + * desired behavior, it's suggested to call qio_channel_flush_zerocopy()
> + * before reusing the buffer.
> + *
> * Returns: 0 if all bytes were written, or -1 on error
> */
>
> -int qio_channel_writev_full_all(QIOChannel *ioc,
> - const struct iovec *iov,
> - size_t niov,
> - int *fds, size_t nfds,
> - Error **errp);
> +int qio_channel_writev_full_all_flags(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + int *fds, size_t nfds,
> + int flags, Error **errp);
> +#define qio_channel_writev_full_all(ioc, iov, niov, fds, nfds, errp) \
> + qio_channel_writev_full_all_flags(ioc, iov, niov, fds, nfds, 0, errp)

There's no need for this at all. Since FD passing is not supported
with zerocopy, there's no reason to ever use
qio_channel_writev_full_all_flags.

> +/**
> + * qio_channel_writev_zerocopy:
> + * @ioc: the channel object
> + * @iov: the array of memory regions to write data from
> + * @niov: the length of the @iov array
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Behaves like qio_channel_writev_full_all_flags, but may write

This should say qio_channel_writev.

> + * data asynchronously while avoiding unnecessary data copy.
> + * This function may return before any data is actually written,
> + * but should queue every buffer for writing.

Callers mustn't rely on a "should" in the docs; they must rely on the
return value, which indicates how many bytes were accepted.

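
To make the return-value contract concrete, here is a small sketch
under stated assumptions: mock_writev_zerocopy() and queue_once() are
hypothetical names, and the mock simply accepts up to a given byte
budget. It shows a caller deriving what is still unqueued from the
return value instead of assuming that every buffer was accepted.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Sum of all buffer lengths in the vector. */
static size_t iov_total(const struct iovec *iov, size_t niov)
{
    size_t total = 0;

    for (size_t i = 0; i < niov; i++) {
        total += iov[i].iov_len;
    }
    return total;
}

/*
 * Hypothetical mock of a zerocopy writev: accepts at most 'budget'
 * bytes and fails with -1 when the budget is negative.
 */
static ssize_t mock_writev_zerocopy(ssize_t budget, const struct iovec *iov,
                                    size_t niov)
{
    size_t total = iov_total(iov, niov);

    if (budget < 0) {
        return -1;
    }
    return total < (size_t)budget ? (ssize_t)total : budget;
}

/*
 * One write attempt. Returns the number of bytes that were NOT queued
 * and still need another call, or -1 on error.
 */
static ssize_t queue_once(ssize_t budget, const struct iovec *iov,
                          size_t niov)
{
    ssize_t accepted = mock_writev_zerocopy(budget, iov, niov);

    if (accepted < 0) {
        return -1;
    }
    assert((size_t)accepted <= iov_total(iov, niov));
    return (ssize_t)(iov_total(iov, niov) - (size_t)accepted);
}
```

A non-zero result means the caller has to retry with the remainder;
only a result of 0 means every buffer has been queued.
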
> + *
> + * If at some point it's necessary to wait for all data to be
> + * written, use qio_channel_flush_zerocopy().
> + *
> + * If zerocopy is not available, returns -1 and set errp.
> + */
> +
> +ssize_t qio_channel_writev_zerocopy(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + Error **errp);
> +
> +/**
> + * qio_channel_flush_zerocopy:
> + * @ioc: the channel object
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Will block until every packet queued with
> + * qio_channel_writev_zerocopy() is sent, or return
> + * in case of any error.
> + *
> + * Returns -1 if any error is found, 0 otherwise.
> + * If not implemented, acts as a no-op, and returns 0.
> + */
> +
> +int qio_channel_flush_zerocopy(QIOChannel *ioc,
> + Error **errp);
>
> #endif /* QIO_CHANNEL_H */
> diff --git a/io/channel.c b/io/channel.c
> index e8b019dc36..009da9b772 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -212,19 +212,21 @@ int qio_channel_readv_full_all(QIOChannel *ioc,
> return ret;
> }
>
> -int qio_channel_writev_all(QIOChannel *ioc,
> - const struct iovec *iov,
> - size_t niov,
> - Error **errp)
> +int qio_channel_writev_all_flags(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + int flags,
> + Error **errp)
> {
> - return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, errp);
> + return qio_channel_writev_full_all_flags(ioc, iov, niov, NULL, 0, flags,
> + errp);
> }
>
> -int qio_channel_writev_full_all(QIOChannel *ioc,
> - const struct iovec *iov,
> - size_t niov,
> - int *fds, size_t nfds,
> - Error **errp)
> +int qio_channel_writev_full_all_flags(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + int *fds, size_t nfds,
> + int flags, Error **errp)
> {
> int ret = -1;
> struct iovec *local_iov = g_new(struct iovec, niov);
> @@ -237,8 +239,15 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
>
> while (nlocal_iov > 0) {
> ssize_t len;
> - len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds,
> - errp);
> +
> + if (flags & QIO_CHANNEL_WRITE_FLAG_ZEROCOPY) {
> + assert(fds == NULL && nfds == 0);
> + len = qio_channel_writev_zerocopy(ioc, local_iov, nlocal_iov, errp);
> + } else {
> + len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds,
> + errp);
> + }
> +
> if (len == QIO_CHANNEL_ERR_BLOCK) {
> if (qemu_in_coroutine()) {
> qio_channel_yield(ioc, G_IO_OUT);
> @@ -474,6 +483,38 @@ off_t qio_channel_io_seek(QIOChannel *ioc,
> }
>
>
> +ssize_t qio_channel_writev_zerocopy(QIOChannel *ioc,
> + const struct iovec *iov,
> + size_t niov,
> + Error **errp)
> +{
> + QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
> +
> + if (!klass->io_writev_zerocopy ||
> + !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY)) {
> + error_setg_errno(errp, EINVAL,
> + "Channel does not support zerocopy writev");
> + return -1;
> + }
> +
> + return klass->io_writev_zerocopy(ioc, iov, niov, errp);
> +}
> +
> +
> +int qio_channel_flush_zerocopy(QIOChannel *ioc,
> + Error **errp)
> +{
> + QIOChannelClass *klass = QIO_CHANNEL_GET_CLASS(ioc);
> +
> + if (!klass->io_flush_zerocopy ||
> + !qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_WRITE_ZEROCOPY)) {
> + return 0;
> + }
> +
> + return klass->io_flush_zerocopy(ioc, errp);
> +}
> +
> +
> static void qio_channel_restart_read(void *opaque)
> {
> QIOChannel *ioc = opaque;
> --
> 2.33.1
>
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|