* [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
@ 2026-05-31 1:01 Askar Safin
2026-05-31 1:01 ` [PATCH 1/3] tee: fs/splice.c: remove unused parameter "flags" from "link_pipe" Askar Safin
` (4 more replies)
0 siblings, 5 replies; 12+ messages in thread
From: Askar Safin @ 2026-05-31 1:01 UTC (permalink / raw)
To: linux-fsdevel, Christian Brauner, Alexander Viro, Jan Kara
Cc: linux-kernel, linux-mm, linux-api, netdev, Linus Torvalds,
Matthew Wilcox, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
This patchset is for VFS.
Recently there have been a lot of vulnerabilities in splice/vmsplice.
vmsplice has also been a source of vulnerabilities in the past:
CVE-2020-29374 (see https://lwn.net/Articles/849638/ ).
Also vmsplice is problematic for other reasons. Here is what other
developers say:
Linus Torvalds in 2023:
> So I'd personally be perfectly ok with just making vmsplice() be
> exactly the same as write, and turn all of vmsplice() into just "it's
> a read() if the pipe is open for read, and a write if it's open for
> writing".
https://lore.kernel.org/all/CAHk-=wgG_2cmHgZwKjydi7=iimyHyN8aessnbM9XQ9ufbaUz9g@mail.gmail.com/
Christoph Hellwig in May 2026:
> vmsplice is the worst, as it is one of the few remaining places that
> can incorrectly dirty file backed pages without telling the file system
> and cause the other problems fixed by a FOLL_PIN conversion, but it is
> the only one where we do not have any idea yet how we could convert it
> to FOLL_PIN due to the unbounded pin time.
https://lore.kernel.org/all/agwFlBKvKytjURDO@infradead.org/
See recent discussion here:
https://lore.kernel.org/all/20260516182126.530498-1-pfalcato@suse.de/T/#u
For all these reasons I propose to make vmsplice a simple wrapper for
preadv2/pwritev2.
vmsplice(fd, vec, vlen, vmsplice_flags) will
be equivalent to preadv2(fd, vec, vlen, -1, rw_flags) if fd is a
readable pipe and to pwritev2(fd, vec, vlen, -1, rw_flags) if fd is a
writable pipe.
SPLICE_F_NONBLOCK is translated to RWF_NOWAIT, all other SPLICE_F_*
flags are ignored.
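The mapping described above can be sketched in userspace. This is only a rough model under stated assumptions, not the kernel code from the patch: the names splice_to_rw_flags, vmsplice_compat and COMPAT_SPLICE_F_ALL are hypothetical, the pipe check uses fstat() where the patch uses get_pipe_info(), and "writable" is taken to mean any access mode other than O_RDONLY, mirroring the FMODE_WRITE test being checked first.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for SPLICE_F_*, RWF_NOWAIT, preadv2(), pwritev2() */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Every SPLICE_F_* bit that is accepted (mirrors the kernel's SPLICE_F_ALL). */
#define COMPAT_SPLICE_F_ALL \
	(SPLICE_F_MOVE | SPLICE_F_NONBLOCK | SPLICE_F_MORE | SPLICE_F_GIFT)

/* Hypothetical helper: only SPLICE_F_NONBLOCK survives the translation. */
static int splice_to_rw_flags(unsigned int splice_flags)
{
	return (splice_flags & SPLICE_F_NONBLOCK) ? RWF_NOWAIT : 0;
}

/* Hypothetical userspace model of the proposed vmsplice() behaviour. */
static ssize_t vmsplice_compat(int fd, const struct iovec *vec, int vlen,
			       unsigned int splice_flags)
{
	struct stat st;
	int fl;

	/* Unknown flag bits are rejected, as in the patch. */
	if (splice_flags & ~COMPAT_SPLICE_F_ALL) {
		errno = EINVAL;
		return -1;
	}

	fl = fcntl(fd, F_GETFL);
	if (fl < 0)
		return -1;	/* errno is already EBADF for a bad descriptor */

	/* Only pipes are accepted; anything else is reported as EBADF. */
	if (fstat(fd, &st) < 0)
		return -1;
	if (!S_ISFIFO(st.st_mode)) {
		errno = EBADF;
		return -1;
	}

	/* Offset -1 means "no explicit position", which is what a pipe needs. */
	if ((fl & O_ACCMODE) != O_RDONLY)
		return pwritev2(fd, vec, vlen, -1,
				splice_to_rw_flags(splice_flags));
	return preadv2(fd, vec, vlen, -1, splice_to_rw_flags(splice_flags));
}
```

Calling vmsplice_compat() on the write end of a pipe and then on the read end should move the bytes through the pipe by copying. Whether RWF_NOWAIT is accepted on a pipe depends on the running kernel, so a caller that passes SPLICE_F_NONBLOCK to this sketch should be prepared for an error on older kernels.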
There is a small change to the handling of NONBLOCK-related flags;
see the commit messages for details.
I tested this patchset in QEMU.
This patchset was written by me, not by LLMs.
Askar Safin (3):
tee: fs/splice.c: remove unused parameter "flags" from "link_pipe"
vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
splice: remove PIPE_BUF_FLAG_GIFT
fs/fuse/dev.c | 1 -
fs/read_write.c | 23 +++++
fs/splice.c | 202 +-------------------------------------
include/linux/pipe_fs_i.h | 1 -
include/linux/skbuff.h | 4 +-
include/linux/splice.h | 2 +-
include/linux/syscalls.h | 4 +-
7 files changed, 33 insertions(+), 204 deletions(-)
base-commit: e7ae89a0c97ce2b68b0983cd01eda67cf373517d (7.1-rc5)
--
2.47.3
* [PATCH 1/3] tee: fs/splice.c: remove unused parameter "flags" from "link_pipe"
2026-05-31 1:01 [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Askar Safin
@ 2026-05-31 1:01 ` Askar Safin
2026-05-31 1:01 ` [PATCH 2/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Askar Safin
` (3 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Askar Safin @ 2026-05-31 1:01 UTC (permalink / raw)
To: linux-fsdevel, Christian Brauner, Alexander Viro, Jan Kara
Cc: linux-kernel, linux-mm, linux-api, netdev, Linus Torvalds,
Matthew Wilcox, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
Remove unused parameter "flags" from "link_pipe".
Signed-off-by: Askar Safin <safinaskar@gmail.com>
---
fs/splice.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/splice.c b/fs/splice.c
index 9d8f63e2fd1a..59adbc2fa4d6 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1849,7 +1849,7 @@ static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe,
*/
static ssize_t link_pipe(struct pipe_inode_info *ipipe,
struct pipe_inode_info *opipe,
- size_t len, unsigned int flags)
+ size_t len)
{
struct pipe_buffer *ibuf, *obuf;
unsigned int i_head, o_head;
@@ -1962,7 +1962,7 @@ ssize_t do_tee(struct file *in, struct file *out, size_t len,
if (!ret) {
ret = opipe_prep(opipe, flags);
if (!ret)
- ret = link_pipe(ipipe, opipe, len, flags);
+ ret = link_pipe(ipipe, opipe, len);
}
}
--
2.47.3
* [PATCH 2/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-05-31 1:01 [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Askar Safin
2026-05-31 1:01 ` [PATCH 1/3] tee: fs/splice.c: remove unused parameter "flags" from "link_pipe" Askar Safin
@ 2026-05-31 1:01 ` Askar Safin
2026-05-31 1:01 ` [PATCH 3/3] splice: remove PIPE_BUF_FLAG_GIFT Askar Safin
` (2 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Askar Safin @ 2026-05-31 1:01 UTC (permalink / raw)
To: linux-fsdevel, Christian Brauner, Alexander Viro, Jan Kara
Cc: linux-kernel, linux-mm, linux-api, netdev, Linus Torvalds,
Matthew Wilcox, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
vmsplice behavior on a writable pipe is now equivalent to pwritev2.
vmsplice behavior on a readable pipe was already nearly equivalent to
preadv2, but I made this explicit, i.e. I made it obvious from the code
that vmsplice is now equivalent to preadv2/pwritev2.
I also moved vmsplice to fs/read_write.c, because it arguably belongs
there now.
Note that SPLICE_F_NONBLOCK behavior changes slightly: previously
vmsplice ignored whether the pipe was opened with O_NONBLOCK, and the
mode of operation depended only on whether SPLICE_F_NONBLOCK was passed.
Now the operation is non-blocking if O_NONBLOCK was passed when
opening *or* SPLICE_F_NONBLOCK was passed to vmsplice. The previous
behavior was arguably buggy, and the new behavior is arguably better.
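As an illustration only, the change in the blocking decision can be modelled as two small predicates. The function names are hypothetical and this is a truth-table sketch of the rule as described in this commit message, not code from the kernel.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for SPLICE_F_NONBLOCK in <fcntl.h> */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Old rule: only the per-call flag mattered; O_NONBLOCK on the file was ignored. */
static bool vmsplice_nonblock_old(int file_flags, unsigned int splice_flags)
{
	(void)file_flags;
	return (splice_flags & SPLICE_F_NONBLOCK) != 0;
}

/* New rule: either the file status flag or the per-call flag is enough. */
static bool vmsplice_nonblock_new(int file_flags, unsigned int splice_flags)
{
	return (file_flags & O_NONBLOCK) != 0 ||
	       (splice_flags & SPLICE_F_NONBLOCK) != 0;
}
```

The two predicates differ in exactly one case: the pipe has O_NONBLOCK set and the caller does not pass SPLICE_F_NONBLOCK. Under the old rule that call could block; under the new rule it does not.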
Now SPLICE_F_GIFT is always ignored by all 3 syscalls: splice, tee
and vmsplice.
Signed-off-by: Askar Safin <safinaskar@gmail.com>
---
fs/read_write.c | 23 +++++
fs/splice.c | 192 +--------------------------------------
include/linux/skbuff.h | 4 +-
include/linux/splice.h | 2 +-
include/linux/syscalls.h | 4 +-
5 files changed, 29 insertions(+), 196 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c
index 50bff7edc91f..1e5444f4dab3 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1213,6 +1213,29 @@ SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
return do_pwritev(fd, vec, vlen, pos, flags);
}
+/*
+ * Legacy preadv2/pwritev2 wrapper.
+ */
+SYSCALL_DEFINE4(vmsplice, unsigned long, fd, const struct iovec __user *, vec,
+ unsigned long, vlen, unsigned int, flags)
+{
+ if (unlikely(flags & ~SPLICE_F_ALL))
+ return -EINVAL;
+
+ CLASS(fd, f)(fd);
+ if (fd_empty(f))
+ return -EBADF;
+
+ /* We do do_writev/do_readv, so it is okay to pass "false" here */
+ if (!get_pipe_info(fd_file(f), /* for_splice = */ false))
+ return -EBADF;
+
+ if (fd_file(f)->f_mode & FMODE_WRITE)
+ return do_writev(fd, vec, vlen, (flags & SPLICE_F_NONBLOCK) ? RWF_NOWAIT : 0);
+ else
+ return do_readv(fd, vec, vlen, (flags & SPLICE_F_NONBLOCK) ? RWF_NOWAIT : 0);
+}
+
/*
* Various compat syscalls. Note that they all pretend to take a native
* iovec - import_iovec will properly treat those as compat_iovecs based on
diff --git a/fs/splice.c b/fs/splice.c
index 59adbc2fa4d6..b1a4e3713bd6 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -159,22 +159,6 @@ const struct pipe_buf_operations page_cache_pipe_buf_ops = {
.get = generic_pipe_buf_get,
};
-static bool user_page_pipe_buf_try_steal(struct pipe_inode_info *pipe,
- struct pipe_buffer *buf)
-{
- if (!(buf->flags & PIPE_BUF_FLAG_GIFT))
- return false;
-
- buf->flags |= PIPE_BUF_FLAG_LRU;
- return generic_pipe_buf_try_steal(pipe, buf);
-}
-
-static const struct pipe_buf_operations user_page_pipe_buf_ops = {
- .release = page_cache_pipe_buf_release,
- .try_steal = user_page_pipe_buf_try_steal,
- .get = generic_pipe_buf_get,
-};
-
static void wakeup_pipe_readers(struct pipe_inode_info *pipe)
{
smp_mb();
@@ -589,8 +573,7 @@ static void splice_from_pipe_end(struct pipe_inode_info *pipe, struct splice_des
* Description:
* This function does little more than loop over the pipe and call
* @actor to do the actual moving of a single struct pipe_buffer to
- * the desired destination. See pipe_to_file, pipe_to_sendmsg, or
- * pipe_to_user.
+ * the desired destination. See pipe_to_file or pipe_to_sendmsg.
*
*/
ssize_t __splice_from_pipe(struct pipe_inode_info *pipe, struct splice_desc *sd,
@@ -1440,179 +1423,6 @@ static ssize_t __do_splice(struct file *in, loff_t __user *off_in,
return ret;
}
-static ssize_t iter_to_pipe(struct iov_iter *from,
- struct pipe_inode_info *pipe,
- unsigned int flags)
-{
- struct pipe_buffer buf = {
- .ops = &user_page_pipe_buf_ops,
- .flags = flags
- };
- size_t total = 0;
- ssize_t ret = 0;
-
- while (iov_iter_count(from)) {
- struct page *pages[16];
- ssize_t left;
- size_t start;
- int i, n;
-
- left = iov_iter_get_pages2(from, pages, ~0UL, 16, &start);
- if (left <= 0) {
- ret = left;
- break;
- }
-
- n = DIV_ROUND_UP(left + start, PAGE_SIZE);
- for (i = 0; i < n; i++) {
- int size = umin(left, PAGE_SIZE - start);
-
- buf.page = pages[i];
- buf.offset = start;
- buf.len = size;
- ret = add_to_pipe(pipe, &buf);
- if (unlikely(ret < 0)) {
- iov_iter_revert(from, left);
- // this one got dropped by add_to_pipe()
- while (++i < n)
- put_page(pages[i]);
- goto out;
- }
- total += ret;
- left -= size;
- start = 0;
- }
- }
-out:
- return total ? total : ret;
-}
-
-static int pipe_to_user(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
- struct splice_desc *sd)
-{
- int n = copy_page_to_iter(buf->page, buf->offset, sd->len, sd->u.data);
- return n == sd->len ? n : -EFAULT;
-}
-
-/*
- * For lack of a better implementation, implement vmsplice() to userspace
- * as a simple copy of the pipe's pages to the user iov.
- */
-static ssize_t vmsplice_to_user(struct file *file, struct iov_iter *iter,
- unsigned int flags)
-{
- struct pipe_inode_info *pipe = get_pipe_info(file, true);
- struct splice_desc sd = {
- .total_len = iov_iter_count(iter),
- .flags = flags,
- .u.data = iter
- };
- ssize_t ret = 0;
-
- if (!pipe)
- return -EBADF;
-
- pipe_clear_nowait(file);
-
- if (sd.total_len) {
- pipe_lock(pipe);
- ret = __splice_from_pipe(pipe, &sd, pipe_to_user);
- pipe_unlock(pipe);
- }
-
- if (ret > 0)
- fsnotify_access(file);
-
- return ret;
-}
-
-/*
- * vmsplice splices a user address range into a pipe. It can be thought of
- * as splice-from-memory, where the regular splice is splice-from-file (or
- * to file). In both cases the output is a pipe, naturally.
- */
-static ssize_t vmsplice_to_pipe(struct file *file, struct iov_iter *iter,
- unsigned int flags)
-{
- struct pipe_inode_info *pipe;
- ssize_t ret = 0;
- unsigned buf_flag = 0;
-
- if (flags & SPLICE_F_GIFT)
- buf_flag = PIPE_BUF_FLAG_GIFT;
-
- pipe = get_pipe_info(file, true);
- if (!pipe)
- return -EBADF;
-
- pipe_clear_nowait(file);
-
- pipe_lock(pipe);
- ret = wait_for_space(pipe, flags);
- if (!ret)
- ret = iter_to_pipe(iter, pipe, buf_flag);
- pipe_unlock(pipe);
- if (ret > 0) {
- wakeup_pipe_readers(pipe);
- fsnotify_modify(file);
- }
- return ret;
-}
-
-/*
- * Note that vmsplice only really supports true splicing _from_ user memory
- * to a pipe, not the other way around. Splicing from user memory is a simple
- * operation that can be supported without any funky alignment restrictions
- * or nasty vm tricks. We simply map in the user memory and fill them into
- * a pipe. The reverse isn't quite as easy, though. There are two possible
- * solutions for that:
- *
- * - memcpy() the data internally, at which point we might as well just
- * do a regular read() on the buffer anyway.
- * - Lots of nasty vm tricks, that are neither fast nor flexible (it
- * has restriction limitations on both ends of the pipe).
- *
- * Currently we punt and implement it as a normal copy, see pipe_to_user().
- *
- */
-SYSCALL_DEFINE4(vmsplice, int, fd, const struct iovec __user *, uiov,
- unsigned long, nr_segs, unsigned int, flags)
-{
- struct iovec iovstack[UIO_FASTIOV];
- struct iovec *iov = iovstack;
- struct iov_iter iter;
- ssize_t error;
- int type;
-
- if (unlikely(flags & ~SPLICE_F_ALL))
- return -EINVAL;
-
- CLASS(fd, f)(fd);
- if (fd_empty(f))
- return -EBADF;
- if (fd_file(f)->f_mode & FMODE_WRITE)
- type = ITER_SOURCE;
- else if (fd_file(f)->f_mode & FMODE_READ)
- type = ITER_DEST;
- else
- return -EBADF;
-
- error = import_iovec(type, uiov, nr_segs,
- ARRAY_SIZE(iovstack), &iov, &iter);
- if (error < 0)
- return error;
-
- if (!iov_iter_count(&iter))
- error = 0;
- else if (type == ITER_SOURCE)
- error = vmsplice_to_pipe(fd_file(f), &iter, flags);
- else
- error = vmsplice_to_user(fd_file(f), &iter, flags);
-
- kfree(iov);
- return error;
-}
-
SYSCALL_DEFINE6(splice, int, fd_in, loff_t __user *, off_in,
int, fd_out, loff_t __user *, off_out,
size_t, len, unsigned int, flags)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2bcf78a4de7b..2961fee3e5cc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -505,7 +505,7 @@ enum {
SKBFL_ZEROCOPY_ENABLE = BIT(0),
/* This indicates at least one fragment might be overwritten
- * (as in vmsplice(), sendfile() ...)
+ * (as in sendfile(), ...)
* If we need to compute a TX checksum, we'll need to copy
* all frags to avoid possible bad checksum
*/
@@ -4017,7 +4017,7 @@ static inline int skb_linearize(struct sk_buff *skb)
* @skb: buffer to test
*
* Return: true if the skb has at least one frag that might be modified
- * by an external entity (as in vmsplice()/sendfile())
+ * by an external entity (as in sendfile())
*/
static inline bool skb_has_shared_frag(const struct sk_buff *skb)
{
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 9dec4861d09f..fb4f035aae83 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -19,7 +19,7 @@
/* we may still block on the fd we splice */
/* from/to, of course */
#define SPLICE_F_MORE (0x04) /* expect more data */
-#define SPLICE_F_GIFT (0x08) /* pages passed in are a gift */
+#define SPLICE_F_GIFT (0x08) /* ignored */
#define SPLICE_F_ALL (SPLICE_F_MOVE|SPLICE_F_NONBLOCK|SPLICE_F_MORE|SPLICE_F_GIFT)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f5639d5ac331..a86a88207956 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -514,8 +514,8 @@ asmlinkage long sys_ppoll_time32(struct pollfd __user *, unsigned int,
struct old_timespec32 __user *, const sigset_t __user *,
size_t);
asmlinkage long sys_signalfd4(int ufd, sigset_t __user *user_mask, size_t sizemask, int flags);
-asmlinkage long sys_vmsplice(int fd, const struct iovec __user *iov,
- unsigned long nr_segs, unsigned int flags);
+asmlinkage long sys_vmsplice(unsigned long fd, const struct iovec __user *vec,
+ unsigned long vlen, unsigned int flags);
asmlinkage long sys_splice(int fd_in, loff_t __user *off_in,
int fd_out, loff_t __user *off_out,
size_t len, unsigned int flags);
--
2.47.3
* [PATCH 3/3] splice: remove PIPE_BUF_FLAG_GIFT
2026-05-31 1:01 [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Askar Safin
2026-05-31 1:01 ` [PATCH 1/3] tee: fs/splice.c: remove unused parameter "flags" from "link_pipe" Askar Safin
2026-05-31 1:01 ` [PATCH 2/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Askar Safin
@ 2026-05-31 1:01 ` Askar Safin
2026-05-31 8:54 ` [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Pedro Falcato
2026-06-01 3:11 ` Andy Lutomirski
4 siblings, 0 replies; 12+ messages in thread
From: Askar Safin @ 2026-05-31 1:01 UTC (permalink / raw)
To: linux-fsdevel, Christian Brauner, Alexander Viro, Jan Kara
Cc: linux-kernel, linux-mm, linux-api, netdev, Linus Torvalds,
Matthew Wilcox, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
It is unused now.
Signed-off-by: Askar Safin <safinaskar@gmail.com>
---
fs/fuse/dev.c | 1 -
fs/splice.c | 6 ++----
include/linux/pipe_fs_i.h | 1 -
3 files changed, 2 insertions(+), 6 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 5dda7080f4a9..fb8fe0c96692 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -2352,7 +2352,6 @@ static ssize_t fuse_dev_splice_write(struct pipe_inode_info *pipe,
goto out_free;
*obuf = *ibuf;
- obuf->flags &= ~PIPE_BUF_FLAG_GIFT;
obuf->len = rem;
ibuf->offset += obuf->len;
ibuf->len -= obuf->len;
diff --git a/fs/splice.c b/fs/splice.c
index b1a4e3713bd6..6ddf7dd72f7b 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1622,10 +1622,9 @@ static int splice_pipe_to_pipe(struct pipe_inode_info *ipipe,
*obuf = *ibuf;
/*
- * Don't inherit the gift and merge flags, we need to
+ * Don't inherit the merge flag, we need to
* prevent multiple steals of this page.
*/
- obuf->flags &= ~PIPE_BUF_FLAG_GIFT;
obuf->flags &= ~PIPE_BUF_FLAG_CAN_MERGE;
obuf->len = len;
@@ -1711,10 +1710,9 @@ static ssize_t link_pipe(struct pipe_inode_info *ipipe,
*obuf = *ibuf;
/*
- * Don't inherit the gift and merge flag, we need to prevent
+ * Don't inherit the merge flag, we need to prevent
* multiple steals of this page.
*/
- obuf->flags &= ~PIPE_BUF_FLAG_GIFT;
obuf->flags &= ~PIPE_BUF_FLAG_CAN_MERGE;
if (obuf->len > len)
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index 7f6a92ac9704..a1eeed800669 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -6,7 +6,6 @@
#define PIPE_BUF_FLAG_LRU 0x01 /* page is on the LRU */
#define PIPE_BUF_FLAG_ATOMIC 0x02 /* was atomically mapped */
-#define PIPE_BUF_FLAG_GIFT 0x04 /* page is a gift */
#define PIPE_BUF_FLAG_PACKET 0x08 /* read() as a packet */
#define PIPE_BUF_FLAG_CAN_MERGE 0x10 /* can merge buffers */
#define PIPE_BUF_FLAG_WHOLE 0x20 /* read() must return entire buffer or error */
--
2.47.3
* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-05-31 1:01 [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Askar Safin
` (2 preceding siblings ...)
2026-05-31 1:01 ` [PATCH 3/3] splice: remove PIPE_BUF_FLAG_GIFT Askar Safin
@ 2026-05-31 8:54 ` Pedro Falcato
2026-05-31 21:21 ` Askar Safin
2026-06-01 3:11 ` Andy Lutomirski
4 siblings, 1 reply; 12+ messages in thread
From: Pedro Falcato @ 2026-05-31 8:54 UTC (permalink / raw)
To: Askar Safin
Cc: linux-fsdevel, Christian Brauner, Alexander Viro, Jan Kara,
linux-kernel, linux-mm, linux-api, netdev, Linus Torvalds,
Matthew Wilcox, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Miklos Szeredi, patches
On Sun, May 31, 2026 at 01:01:04AM +0000, Askar Safin wrote:
> This patchset is for VFS.
>
> Recently we got a lot of vulnerabilities in splice/vmsplice.
>
> Also vmsplice already was source of vulnerabilities in the past:
> CVE-2020-29374 (see https://lwn.net/Articles/849638/ ).
>
> Also vmsplice is problematic for other reasons. Here is what other
> developers say:
>
> Linus Torvalds in 2023:
> > So I'd personally be perfectly ok with just making vmsplice() be
> > exactly the same as write, and turn all of vmsplice() into just "it's
> > a read() if the pipe is open for read, and a write if it's open for
> > writing".
> https://lore.kernel.org/all/CAHk-=wgG_2cmHgZwKjydi7=iimyHyN8aessnbM9XQ9ufbaUz9g@mail.gmail.com/
>
> Christoph Hellwig in May 2026:
> > vmsplice is the worst, as it is one of the few remaining places that
> > can incorrectly dirty file backed pages without telling the file system
> > and cause the other problems fixed by a FOLL_PIN conversion, but it is
> > the only one where we do not have any idea yet how we could convert it
> > to FOLL_PIN due to the unbounded pin time.
> https://lore.kernel.org/all/agwFlBKvKytjURDO@infradead.org/
>
> See recent discussion here:
> https://lore.kernel.org/all/20260516182126.530498-1-pfalcato@suse.de/T/#u
So, you took an ongoing discussion with an ongoing RFC patchset, and you
decided to reimplement part of the idea on your own, as a concurrent patchset.
Riiiiiight.... I don't think I have to NAK this, do I?
>
> For all these reasons I propose to make vmsplice a simple wrapper for
> preadv2/pwritev2.
>
> vmsplice(fd, vec, vlen, vmsplice_flags) will
> be equivalent to preadv2(fd, vec, vlen, -1, rw_flags) if you have
> readable pipe and to pwritev2(fd, vec, vlen, -1, rw_flags) if you have
> writable pipe.
This does not work. https://codesearch.debian.net/search?q=vmsplice%28&literal=1
There are users.
--
Pedro
* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-05-31 8:54 ` [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Pedro Falcato
@ 2026-05-31 21:21 ` Askar Safin
2026-06-01 16:16 ` Christian Brauner
0 siblings, 1 reply; 12+ messages in thread
From: Askar Safin @ 2026-05-31 21:21 UTC (permalink / raw)
To: Pedro Falcato
Cc: linux-fsdevel, Christian Brauner, Alexander Viro, Jan Kara,
linux-kernel, linux-mm, linux-api, netdev, Linus Torvalds,
Matthew Wilcox, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Miklos Szeredi, patches
On Sun, May 31, 2026 at 11:54 AM Pedro Falcato <pfalcato@suse.de> wrote:
> So, you took an ongoing discussion with an ongoing RFC patchset, and you
> decided to reimplement part of the idea on your own, as a concurrent patchset.
Yes. But I propose an alternative solution to this problem.
Brauner said in discussion for your patchset:
"So I'm not very likely to pick this up as is".
So, I decided to submit another solution.
Pedro, I'm not trying to insult you.
Other kernel developers will decide which of these two solutions they like more.
Many people in the discussion of your patchset said that they
dislike splice/vmsplice, and especially vmsplice.
Hellwig said "vmsplice is the worst".
Brauner, Hellwig, Horn said that they dislike vmsplice.
They said that vmsplice in its current form should not
be used, and that it is broken.
Despite all these problems nobody has managed to fix
vmsplice in all these years.
So I propose just to effectively remove it.
You may think that I just saw a recent discussion and decided
to jump in. No. splice/vmsplice has been a topic of interest of mine
for many years. You can verify this by searching "f:Askar splice"
on lore.kernel.org . I simply decided that, given the
recent vulnerabilities, now is the perfect time to solve
all these vmsplice problems once and for all.
I explained my position here:
https://lore.kernel.org/all/20260523204100.553125-1-safinaskar@gmail.com/ .
Nobody answered, so I just posted this patchset.
If my patchset is applied, then I will try to deal
with splice-pagecache-to-pipe somehow,
probably by removing it, too. :) I decided to deal
with vmsplice first, because it seems to be the
easier problem.
> > vmsplice(fd, vec, vlen, vmsplice_flags) will
> > be equivalent to preadv2(fd, vec, vlen, -1, rw_flags) if you have
> > readable pipe and to pwritev2(fd, vec, vlen, -1, rw_flags) if you have
> > writable pipe.
>
> This does not work. https://codesearch.debian.net/search?q=vmsplice%28&literal=1
> There are users.
Yes, there are. But my solution is compatible. vmsplice is simply a performance
optimization. vmsplice will work just as before, but slower.
And, most importantly, the vmsplice design problems will be gone
(nobody has managed to fix them in all these years anyway).
--
Askar Safin
* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-05-31 1:01 [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Askar Safin
` (3 preceding siblings ...)
2026-05-31 8:54 ` [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2 Pedro Falcato
@ 2026-06-01 3:11 ` Andy Lutomirski
2026-06-01 15:36 ` Matthew Wilcox
4 siblings, 1 reply; 12+ messages in thread
From: Andy Lutomirski @ 2026-06-01 3:11 UTC (permalink / raw)
To: Askar Safin
Cc: linux-fsdevel, Christian Brauner, Alexander Viro, Jan Kara,
linux-kernel, linux-mm, linux-api, netdev, Linus Torvalds,
Matthew Wilcox, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
On Sat, May 30, 2026 at 6:03 PM Askar Safin <safinaskar@gmail.com> wrote:
>
> See recent discussion here:
> https://lore.kernel.org/all/20260516182126.530498-1-pfalcato@suse.de/T/#u
>
> For all these reasons I propose to make vmsplice a simple wrapper for
> preadv2/pwritev2.
>
I have no comment on the code or the history. But I'm 100% in favor
of the solution. vmsplice is a crappy API, and would be incredibly
complex to get the implementation right, and it should be removed.
But it has users, and the approach of just mapping them straight to
pread/pwrite makes perfect sense.
(If anyone wants to contemplate how bad the API is, contemplate gift
mode. Or contemplate that, if you want correct results, you need to
avoid modifying the memory until the recipient is done reading or you
need to avoid reading the memory until the writer is done writing, and
vmsplice *does not tell you when it's done*. And there isn't even a
caller specification of whether they want to read or write. It's ...
crap.)
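A small sketch of why plain copy semantics sidestep the "when is it done?" problem described above. It uses writev() as a stand-in for the copying write path (an assumption for illustration; the helper name is hypothetical): once the call returns, the data has been copied into the pipe, so the caller may reuse the buffer right away.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write buf into a pipe with writev(), scribble over buf, then read the
 * pipe back. Returns 1 if the reader still sees the original bytes,
 * 0 if it sees the scribbled ones, -1 on error.
 */
static int reader_sees_original_after_overwrite(void)
{
	char buf[] = "original";
	char got[sizeof(buf)] = {0};
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	int p[2], ret = -1;

	if (pipe(p) < 0)
		return -1;

	if (writev(p[1], &iov, 1) != (ssize_t)sizeof(buf))
		goto out;

	/* The bytes were copied into the pipe, so reusing buf is safe. */
	memset(buf, 'X', sizeof(buf) - 1);

	if (read(p[0], got, sizeof(got)) != (ssize_t)sizeof(buf))
		goto out;

	ret = memcmp(got, "original", sizeof(buf)) == 0;
out:
	close(p[0]);
	close(p[1]);
	return ret;
}
```

With a page-referencing vmsplice() in place of writev(), the same sequence has no such guarantee, because the caller has no way to learn when the pipe is done with the pages.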
--Andy
* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-06-01 3:11 ` Andy Lutomirski
@ 2026-06-01 15:36 ` Matthew Wilcox
2026-06-01 15:50 ` Linus Torvalds
0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2026-06-01 15:36 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Askar Safin, linux-fsdevel, Christian Brauner, Alexander Viro,
Jan Kara, linux-kernel, linux-mm, linux-api, netdev,
Linus Torvalds, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
On Sun, May 31, 2026 at 08:11:34PM -0700, Andy Lutomirski wrote:
> On Sat, May 30, 2026 at 6:03 PM Askar Safin <safinaskar@gmail.com> wrote:
> >
> > See recent discussion here:
> > https://lore.kernel.org/all/20260516182126.530498-1-pfalcato@suse.de/T/#u
> >
> > For all these reasons I propose to make vmsplice a simple wrapper for
> > preadv2/pwritev2.
> >
>
> I have no comment on the code or the history. But I'm 100% in favor
> of the solution. vmsplice is a crappy API, and would be incredibly
> complex to get the implementation right, and it should be removed.
> But it has users, and the approach of just mapping them straight to
> pread/pwrite makes perfect sense.
I agree with Andy. I think it was appropriate to send this series, since
(as far as I can tell) it's a completely different approach from the others
taken. I'm not really qualified to judge whether the implementation is
good (it's a bit outside my competency as a reviewer), but the described
approach is more convincing to me than the other approaches.
Can we review this series properly?
* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-06-01 15:36 ` Matthew Wilcox
@ 2026-06-01 15:50 ` Linus Torvalds
2026-06-01 16:17 ` Christian Brauner
0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2026-06-01 15:50 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Andy Lutomirski, Askar Safin, linux-fsdevel, Christian Brauner,
Alexander Viro, Jan Kara, linux-kernel, linux-mm, linux-api,
netdev, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
On Mon, 1 Jun 2026 at 08:36, Matthew Wilcox <willy@infradead.org> wrote:
>
> Can we review this series properly?
Well, since it pretty much is what I suggested a few years ago, I
certainly won't NAK it.
And the patches looked very straightforward to me. Just the final
diffstat is worth quoting again because that certainly doesn't look
problematic:
7 files changed, 33 insertions(+), 204 deletions(-)
and it removes that GIFT flag that was truly disgusting.
So I'm certainly ok with it from a "looking at the patch" standpoint.
I didn't _test_ it. I don't have any workload that might remotely
care.
I did a quick scan on debian code search for vmsplice, and after ten
pages of entries that weren't actually *using* it but had lists of
system calls, I grew bored. So there are likely users, but I don't
know what they are and how much they care. It *might* be a big
performance issue somewhere. Unlikely, but...
Linus
* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-05-31 21:21 ` Askar Safin
@ 2026-06-01 16:16 ` Christian Brauner
0 siblings, 0 replies; 12+ messages in thread
From: Christian Brauner @ 2026-06-01 16:16 UTC (permalink / raw)
To: Askar Safin
Cc: Pedro Falcato, linux-fsdevel, Alexander Viro, Jan Kara,
linux-kernel, linux-mm, linux-api, netdev, Linus Torvalds,
Matthew Wilcox, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Miklos Szeredi, patches
On Mon, Jun 01, 2026 at 12:21:06AM +0300, Askar Safin wrote:
> On Sun, May 31, 2026 at 11:54 AM Pedro Falcato <pfalcato@suse.de> wrote:
> > So, you took an ongoing discussion with an ongoing RFC patchset, and you
> > decided to reimplement part of the idea on your own, as a concurrent patchset.
>
> Yes. But I propose an alternative solution to this problem.
So I think this is a case where no explicit rules have been broken. But
if you know that someone has been posting patches and is working on a
problem just racing them to get your own stuff merged is very likely to
unnecessarily ruffle feathers. So sync with the person next time.
The discussion wasn't at an impasse and Pedro is expected to follow-up.
It's not very nice to just have someone else's work be for naught.
> Brauner said in discussion for your patchset:
> "So I'm not very likely to pick this up as is".
> So, I decided to submit another solution.
This lacks quite some context... I said "in its current form" and then a
long discussion ensued.
> If my patchset is applied, then I will try to deal
> with splice-pagecache-to-pipe somehow,
> probably by removing it, too. :) I decided first
So ok, but this is literally what Pedro is working on. This just wastes
people's time.
* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-06-01 15:50 ` Linus Torvalds
@ 2026-06-01 16:17 ` Christian Brauner
2026-06-01 16:22 ` Linus Torvalds
0 siblings, 1 reply; 12+ messages in thread
From: Christian Brauner @ 2026-06-01 16:17 UTC (permalink / raw)
To: Linus Torvalds
Cc: Matthew Wilcox, Andy Lutomirski, Askar Safin, linux-fsdevel,
Alexander Viro, Jan Kara, linux-kernel, linux-mm, linux-api,
netdev, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
On Mon, Jun 01, 2026 at 08:50:00AM -0700, Linus Torvalds wrote:
> On Mon, 1 Jun 2026 at 08:36, Matthew Wilcox <willy@infradead.org> wrote:
> >
> > Can we review this series properly?
>
> Well, since it pretty much is what I suggested a few years ago, I
> certainly won't NAK it.
>
> And the patches looked very straightforward to me. Just the final
> diffstat is worth quoting again because that certainly doesn't look
> problematic:
>
> 7 files changed, 33 insertions(+), 204 deletions(-)
>
> and it removes that GIFT flag that was truly disgusting.
>
> So I'm certainly ok with it from a "looking at the patch" standpoint.
> I didn't _test_ it. I don't have any workload that might remotely
> care.
>
> I did a quick scan on debian code search for vmsplice, and after ten
> pages of entries that weren't actually *using* it but had lists of
> system calls, I grew bored. So there are likely users, but I don't
> know what they are and how much they care. It *might* be a big
> performance issue somewhere. Unlikely, but...
As usual I would argue to accept it and revert in case we get actual
regression reports...
* Re: [PATCH 0/3] vmsplice: make vmsplice a trivial wrapper for preadv2/pwritev2
2026-06-01 16:17 ` Christian Brauner
@ 2026-06-01 16:22 ` Linus Torvalds
0 siblings, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2026-06-01 16:22 UTC (permalink / raw)
To: Christian Brauner
Cc: Matthew Wilcox, Andy Lutomirski, Askar Safin, linux-fsdevel,
Alexander Viro, Jan Kara, linux-kernel, linux-mm, linux-api,
netdev, Jens Axboe, Christoph Hellwig, David Howells,
Andrew Morton, David Hildenbrand, Pedro Falcato, Miklos Szeredi,
patches
On Mon, 1 Jun 2026 at 09:17, Christian Brauner <brauner@kernel.org> wrote:
>
> As usual I would argue to accept it and revert in case we get actual
> regression reports...
Yes, likely the only way we'd ever find out ..
Linus