Git development
 help / color / mirror / Atom feed
From: Ben Knoble <ben.knoble@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Jeff King <peff@peff.net>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	"Randall S. Becker" <randall.becker@nexbridge.ca>,
	Phillip Wood <phillip.wood@dunelm.org.uk>,
	Randall Becker <randall.becker@nexbridge.ca>
Subject: Re: [PATCH] wrapper: properly handle MAX_IO_SIZE in `write_in_full()`
Date: Thu, 9 Apr 2026 11:46:24 -0400	[thread overview]
Message-ID: <98E6F739-4ECA-44A4-8645-0B153C969E36@gmail.com> (raw)
In-Reply-To: <20260409-b4-pks-writev-max-io-size-v1-1-81730e8f35df@pks.im>

Admitting I am out of my depth here…

> Le 9 avr. 2026 à 08:52, Patrick Steinhardt <ps@pks.im> a écrit :
> 
> Some systems like NonStop set a comparatively small `MAX_IO_SIZE`, which
> limits the maximum number of bytes we're allowed to write in a single
> call. We already handle this limit properly in `xwrite()`, but we have
> recently introduced wrappers for writev(3p) where we don't. This will
> cause the syscall to return EINVAL in case somebody passes an iovec
> entry to writev(3p) that is larger than `MAX_IO_SIZE`.
> 
> Introduce a new function `xwritev()` that is similar to `xwrite()` in
> that it handles such platform-specific nuances. The logic is rather
> simple: we simply coalesce all iovecs that don't exceed `MAX_IO_SIZE`
> and pass those to writev(3p). If the first iovec already exceeds the
> limit, we'll instead pass it to `xwrite()`, which handles the limit for
> us.
> 
> Adapt `writev_in_full()` to use this new wrapper. As this wrapper
> already knows to to call writev(3p) in a loop already it doesn't need
> any further adjustment.
> 
> Reported-by: Randall Becker <randall.becker@nexbridge.ca>
> Helped-by: Jeff King <peff@peff.net>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> Hi,
> 
> this fixes the issue reported by Randall in [1].
> 
> I mostly wanted to get this patch out there so that we can discuss a
> proposed fix, but as said in the thread I'm also happy to revise course
> and instead set NO_WRITEV on NonStop for now. I think we'll want to
> eventually land a fix like the one proposed here though, and at that
> point the workaround would not be required anymore.
> 
> Thanks!
> 
> Patrick
> 
> [1]: <00f401dcc6e6$7183c0f0$548b42d0$@nexbridge.com>
> ---
> wrapper.c | 51 +++++++++++++++++++++++++++++++++++++++++++++------
> wrapper.h |  1 +
> 2 files changed, 46 insertions(+), 6 deletions(-)
> 
> diff --git a/wrapper.c b/wrapper.c
> index be8fa575e6..d989c78b4b 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -323,21 +323,60 @@ ssize_t write_in_full(int fd, const void *buf, size_t count)
>    return total;
> }
> 
> +ssize_t xwritev(int fd, struct iovec *iov, int iovcnt)
> +{
> +    ssize_t bytes_written;
> +    size_t total_length;
> +    int i;
> +
> +    /*
> +     * We need to make sure that writev(3p) call does not write more than
> +     * `MAX_IO_SIZE` many bytes. If we do exceed that limit, we only pass
> +     * those iovecs to writev(3p) that sum up to less than the limit.
> +     *
> +     * If on the other hand the first iovec entry already exceeds this
> +     * limit we'll instead use xwrite() to write it, which knows to handle
> +     * `MAX_IO_SIZE` for us.
> +     */
> +    for (i = 0, total_length = 0; i < iovcnt; i++) {
> +        if (unsigned_add_overflows(total_length, iov[i].iov_len))
> +            break;
> +
> +        total_length += iov[i].iov_len;
> +        if (total_length > MAX_IO_SIZE)
> +            break;
> +    }
> +
> +    if (i < iovcnt) {
> +        /*
> +         * The first entry exceeds MAX_IO_SIZE, so we pass it to
> +         * xwrite, which knows to handle this case.
> +         */
> +        if (!i)
> +            return xwrite(fd, iov->iov_base, iov->iov_len);

It took me starting to write this email wondering “but i could be >= 1?” to realize that this comment applies to the !i case below. Darn.

Still, I find the declaration (“the first entry exceeds”) before the check a bit confusing. Is that typical of our style (in which case leave it be)?

> +        iovcnt = i;
> +    }
> +
> +    bytes_written = writev(fd, iov, iovcnt);
> +    if (!bytes_written) {
> +        errno = ENOSPC;
> +        return -1;
> +    }
> +
> +    return bytes_written;
> +}
> +
> ssize_t writev_in_full(int fd, struct iovec *iov, int iovcnt)
> {
>    ssize_t total_written = 0;
> 
>    while (iovcnt) {
> -        ssize_t bytes_written = writev(fd, iov, iovcnt);
> -        if (bytes_written < 0) {
> +        ssize_t bytes_written = xwritev(fd, iov, iovcnt);
> +        if (bytes_written <= 0) {
>            if (errno == EINTR || errno == EAGAIN)
>                continue;
>            return -1;
>        }
> -        if (!bytes_written) {
> -            errno = ENOSPC;
> -            return -1;
> -        }
> 
>        total_written += bytes_written;
> 
> diff --git a/wrapper.h b/wrapper.h
> index 27519b32d1..a6287d7f4d 100644
> --- a/wrapper.h
> +++ b/wrapper.h
> @@ -16,6 +16,7 @@ void *xmmap_gently(void *start, size_t length, int prot, int flags, int fd, off_
> int xopen(const char *path, int flags, ...);
> ssize_t xread(int fd, void *buf, size_t len);
> ssize_t xwrite(int fd, const void *buf, size_t len);
> +ssize_t xwritev(int fd, struct iovec *iov, int iovcnt);
> ssize_t xpread(int fd, void *buf, size_t len, off_t offset);
> int xdup(int fd);
> FILE *xfopen(const char *path, const char *mode);
> 
> ---
> base-commit: b15384c06f77bc2d34d0d3623a8a58218313a561
> change-id: 20260409-b4-pks-writev-max-io-size-e9b803439ae8
> 
> 

  reply	other threads:[~2026-04-09 15:46 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-09 12:47 [PATCH] wrapper: properly handle MAX_IO_SIZE in `write_in_full()` Patrick Steinhardt
2026-04-09 15:46 ` Ben Knoble [this message]
2026-04-09 16:42 ` Junio C Hamano
2026-04-09 20:23   ` Jeff King
2026-04-09 20:40     ` Junio C Hamano
2026-04-09 20:59       ` Jeff King
2026-04-09 21:09         ` Junio C Hamano
2026-04-10  5:19           ` Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98E6F739-4ECA-44A4-8645-0B153C969E36@gmail.com \
    --to=ben.knoble@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=phillip.wood@dunelm.org.uk \
    --cc=ps@pks.im \
    --cc=randall.becker@nexbridge.ca \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox