From: Ben Knoble <ben.knoble@gmail.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
Jeff King <peff@peff.net>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
"Randall S. Becker" <randall.becker@nexbridge.ca>,
Phillip Wood <phillip.wood@dunelm.org.uk>,
Randall Becker <randall.becker@nexbridge.ca>
Subject: Re: [PATCH] wrapper: properly handle MAX_IO_SIZE in `write_in_full()`
Date: Thu, 9 Apr 2026 11:46:24 -0400 [thread overview]
Message-ID: <98E6F739-4ECA-44A4-8645-0B153C969E36@gmail.com> (raw)
In-Reply-To: <20260409-b4-pks-writev-max-io-size-v1-1-81730e8f35df@pks.im>
Admitting I am out of my depth here…
> Le 9 avr. 2026 à 08:52, Patrick Steinhardt <ps@pks.im> a écrit :
>
> Some systems like NonStop set a comparatively small `MAX_IO_SIZE`, which
> limits the maximum number of bytes we're allowed to write in a single
> call. We already handle this limit properly in `xwrite()`, but we have
> recently introduced wrappers for writev(3p) where we don't. This will
> cause the syscall to return EINVAL in case somebody passes an iovec
> entry to writev(3p) that is larger than `MAX_IO_SIZE`.
>
> Introduce a new function `xwritev()` that is similar to `xwrite()` in
> that it handles such platform-specific nuances. The logic is rather
> simple: we simply coalesce all iovecs that don't exceed `MAX_IO_SIZE`
> and pass those to writev(3p). If the first iovec already exceeds the
> limit, we'll instead pass it to `xwrite()`, which handles the limit for
> us.
>
> Adapt `writev_in_full()` to use this new wrapper. As this wrapper
> already knows to to call writev(3p) in a loop already it doesn't need
> any further adjustment.
>
> Reported-by: Randall Becker <randall.becker@nexbridge.ca>
> Helped-by: Jeff King <peff@peff.net>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> Hi,
>
> this fixes the issue reported by Randall in [1].
>
> I mostly wanted to get this patch out there so that we can discuss a
> proposed fix, but as said in the thread I'm also happy to revise course
> and instead set NO_WRITEV on NonStop for now. I think we'll want to
> eventually land a fix like the one proposed here though, and at that
> point the workaround would not be required anymore.
>
> Thanks!
>
> Patrick
>
> [1]: <00f401dcc6e6$7183c0f0$548b42d0$@nexbridge.com>
> ---
> wrapper.c | 51 +++++++++++++++++++++++++++++++++++++++++++++------
> wrapper.h | 1 +
> 2 files changed, 46 insertions(+), 6 deletions(-)
>
> diff --git a/wrapper.c b/wrapper.c
> index be8fa575e6..d989c78b4b 100644
> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -323,21 +323,60 @@ ssize_t write_in_full(int fd, const void *buf, size_t count)
> return total;
> }
>
> +ssize_t xwritev(int fd, struct iovec *iov, int iovcnt)
> +{
> + ssize_t bytes_written;
> + size_t total_length;
> + int i;
> +
> + /*
> + * We need to make sure that writev(3p) call does not write more than
> + * `MAX_IO_SIZE` many bytes. If we do exceed that limit, we only pass
> + * those iovecs to writev(3p) that sum up to less than the limit.
> + *
> + * If on the other hand the first iovec entry already exceeds this
> + * limit we'll instead use xwrite() to write it, which knows to handle
> + * `MAX_IO_SIZE` for us.
> + */
> + for (i = 0, total_length = 0; i < iovcnt; i++) {
> + if (unsigned_add_overflows(total_length, iov[i].iov_len))
> + break;
> +
> + total_length += iov[i].iov_len;
> + if (total_length > MAX_IO_SIZE)
> + break;
> + }
> +
> + if (i < iovcnt) {
> + /*
> + * The first entry exceeds MAX_IO_SIZE, so we pass it to
> + * xwrite, which knows to handle this case.
> + */
> + if (!i)
> + return xwrite(fd, iov->iov_base, iov->iov_len);
It took me starting to write this email wondering “but i could be >= 1?” to realize that this comment applies to the !i case below. Darn.
Still, I find the declaration (“the first entry exceeds”) before the check a bit confusing. Is that typical of our style (in which case leave it be)?
> + iovcnt = i;
> + }
> +
> + bytes_written = writev(fd, iov, iovcnt);
> + if (!bytes_written) {
> + errno = ENOSPC;
> + return -1;
> + }
> +
> + return bytes_written;
> +}
> +
> ssize_t writev_in_full(int fd, struct iovec *iov, int iovcnt)
> {
> ssize_t total_written = 0;
>
> while (iovcnt) {
> - ssize_t bytes_written = writev(fd, iov, iovcnt);
> - if (bytes_written < 0) {
> + ssize_t bytes_written = xwritev(fd, iov, iovcnt);
> + if (bytes_written <= 0) {
> if (errno == EINTR || errno == EAGAIN)
> continue;
> return -1;
> }
> - if (!bytes_written) {
> - errno = ENOSPC;
> - return -1;
> - }
>
> total_written += bytes_written;
>
> diff --git a/wrapper.h b/wrapper.h
> index 27519b32d1..a6287d7f4d 100644
> --- a/wrapper.h
> +++ b/wrapper.h
> @@ -16,6 +16,7 @@ void *xmmap_gently(void *start, size_t length, int prot, int flags, int fd, off_
> int xopen(const char *path, int flags, ...);
> ssize_t xread(int fd, void *buf, size_t len);
> ssize_t xwrite(int fd, const void *buf, size_t len);
> +ssize_t xwritev(int fd, struct iovec *iov, int iovcnt);
> ssize_t xpread(int fd, void *buf, size_t len, off_t offset);
> int xdup(int fd);
> FILE *xfopen(const char *path, const char *mode);
>
> ---
> base-commit: b15384c06f77bc2d34d0d3623a8a58218313a561
> change-id: 20260409-b4-pks-writev-max-io-size-e9b803439ae8
>
>
next prev parent reply other threads:[~2026-04-09 15:46 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-09 12:47 [PATCH] wrapper: properly handle MAX_IO_SIZE in `write_in_full()` Patrick Steinhardt
2026-04-09 15:46 ` Ben Knoble [this message]
2026-04-09 16:42 ` Junio C Hamano
2026-04-09 20:23 ` Jeff King
2026-04-09 20:40 ` Junio C Hamano
2026-04-09 20:59 ` Jeff King
2026-04-09 21:09 ` Junio C Hamano
2026-04-10 5:19 ` Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=98E6F739-4ECA-44A4-8645-0B153C969E36@gmail.com \
--to=ben.knoble@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=phillip.wood@dunelm.org.uk \
--cc=ps@pks.im \
--cc=randall.becker@nexbridge.ca \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox