* Re: [PATCH net 1/1] net: smc: fix splice entry lifetime imbalance in smc_rx_splice
2026-06-10 17:54 ` [PATCH net 1/1] net: smc: fix splice entry lifetime imbalance in smc_rx_splice Ren Wei
@ 2026-06-11 17:54 ` sashiko-bot
2026-06-15 19:49 ` Jakub Kicinski
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: sashiko-bot @ 2026-06-11 17:54 UTC (permalink / raw)
To: Ren Wei; +Cc: Alexander Gordeev, Heiko Carstens, Vasily Gorbik, linux-s390
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
Pre-existing issues:
- [High] Silent data loss and socket state desynchronization on partial splice due to ignored short return value.
- [High] Legitimate negative error codes from smc_rx_splice() are incorrectly masked as -EFAULT.
--
commit 2f35c4cc130c3fdbc834b04e4b425109926602ac
Author: Daming Li <d4n.for.sec@gmail.com>
net: smc: fix splice entry lifetime imbalance in smc_rx_splice
This commit fixes a splice entry lifetime imbalance in smc_rx_splice() by
taking page and socket references for every candidate splice entry before
calling splice_to_pipe(). Unqueued entries are properly released via
smc_rx_spd_release().
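The reference balance described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct toy_obj, toy_get(), toy_put(), toy_splice() and toy_pipe_release() are hypothetical stand-ins, not kernel APIs, and the "pipe" simply accepts the first `queued` entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted kernel object (page or socket). */
struct toy_obj {
	int refs;
};

static void toy_get(struct toy_obj *o)
{
	o->refs++;
}

static void toy_put(struct toy_obj *o)
{
	assert(o->refs > 0);	/* an underflow here would model the bug */
	o->refs--;
}

/*
 * Model of the patched flow: take one page reference and one socket
 * reference per candidate entry up front, then let the "pipe" accept
 * the first `queued` entries. The unqueued tail is dropped right away,
 * which is the role smc_rx_spd_release() plays in the patch.
 * Returns the number of entries released through that path.
 */
static size_t toy_splice(struct toy_obj *pages, size_t nr_pages,
			 size_t queued, struct toy_obj *sk)
{
	size_t i, released = 0;

	assert(queued <= nr_pages);
	for (i = 0; i < nr_pages; i++) {
		toy_get(&pages[i]);
		toy_get(sk);
	}
	for (i = queued; i < nr_pages; i++) {
		toy_put(&pages[i]);
		toy_put(sk);
		released++;
	}
	return released;
}

/* Later drop of the queued entries, modelling the pipe buffer release. */
static void toy_pipe_release(struct toy_obj *pages, size_t queued,
			     struct toy_obj *sk)
{
	size_t i;

	for (i = 0; i < queued; i++) {
		toy_put(&pages[i]);
		toy_put(sk);
	}
}
```

In this model every get has exactly one matching put, whichever release path an entry takes, so the counts return to their starting values once the queued buffers are consumed.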
> diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
> index c1d9b923938da..88aee0d935977 100644
> --- a/net/smc/smc_rx.c
> +++ b/net/smc/smc_rx.c
[ ... ]
> @@ -217,16 +226,8 @@ static int smc_rx_splice(struct pipe_inode_info *pipe, char *src, size_t len,
>  	spd.spd_release = smc_rx_spd_release;
>
>  	bytes = splice_to_pipe(pipe, &spd);
[Severity: High]
This isn't a bug introduced by this patch, but this pre-existing issue seems
to cause silent data loss on partial splices.
When smc_rx_splice() queues data into a pipe, it returns the actual number of
bytes successfully spliced. If the pipe nears capacity, this can be a short
count (0 < rc < chunk_len).
However, looking at its caller smc_rx_recvmsg():
net/smc/smc_rx.c:smc_rx_recvmsg() {
    ...
        } else {
            rc = smc_rx_splice(pipe, rcvbuf_base +
                               chunk_off, chunk_len,
                               smc);
        }
    ...
        read_remaining -= chunk_len;
        read_done += chunk_len;
    ...
        atomic_sub(copylen, &conn->bytes_to_rcv);
    ...
}
Does this code completely ignore the actual bytes spliced (rc) and
unconditionally assume the full chunk_len was processed?
Since it subtracts the full copylen from conn->bytes_to_rcv, wouldn't this
permanently desynchronize the socket's internal receive buffer accounting
and cause silent data loss?
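To make the concern concrete, here is a minimal userspace sketch of the accounting in question. It is not a proposed fix, and it does not show that a short count can actually reach this point in the kernel. struct toy_rx and toy_account() are hypothetical names; only the arithmetic mirrors the excerpt above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the receive-side counters. */
struct toy_rx {
	size_t bytes_to_rcv;	/* bytes still pending in the receive buffer */
	size_t read_done;	/* bytes reported to the caller so far */
};

/*
 * Account for one chunk. With use_actual == 0 the full chunk_len is
 * charged, as in the excerpt above; with use_actual != 0 only the
 * bytes that were really spliced are charged.
 * Returns the number of bytes charged but never spliced.
 */
static size_t toy_account(struct toy_rx *rx, size_t chunk_len,
			  size_t spliced, int use_actual)
{
	size_t done = use_actual ? spliced : chunk_len;

	assert(spliced <= chunk_len);
	assert(done <= rx->bytes_to_rcv);
	rx->bytes_to_rcv -= done;
	rx->read_done += done;
	return done - spliced;
}
```

With a 4096-byte chunk of which only 1024 bytes fit into the pipe, the first variant charges 3072 bytes that were never delivered, while the second leaves them pending.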
[Severity: High]
This is a pre-existing issue, but legitimate negative error codes from
smc_rx_splice() seem to be incorrectly masked as -EFAULT.
If smc_rx_splice() returns a negative system error like -EAGAIN or -ENOMEM,
the caller smc_rx_recvmsg() intercepts it:
net/smc/smc_rx.c:smc_rx_recvmsg() {
    ...
        } else {
            rc = smc_rx_splice(pipe, rcvbuf_base +
                               chunk_off, chunk_len,
                               smc);
        }
        if (rc < 0) {
            if (!read_done)
                read_done = -EFAULT;
            goto out;
        }
    ...
}
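By replacing retryable conditions such as -EAGAIN with -EFAULT, which signals
a bad address rather than a transient failure, wouldn't this cause userspace
applications that use splice() on non-blocking sockets to treat a retryable
condition as fatal and unexpectedly abort connections?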
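The difference can be shown with a minimal sketch of the two return policies. finish_masked() follows the branch quoted above; finish_propagated() is a hypothetical alternative for comparison only, not a suggested patch. The same branch presumably also serves the non-splice copy path, where -EFAULT may well be the right answer.

```c
#include <errno.h>

/*
 * Policy quoted above: if nothing has been read yet, any negative rc
 * is reported to the caller as -EFAULT.
 */
static long finish_masked(long rc, long read_done)
{
	if (rc < 0 && !read_done)
		return -EFAULT;
	return read_done;
}

/*
 * Hypothetical alternative: if nothing has been read yet, hand the
 * original error code back so the caller can tell -EAGAIN from a
 * genuine fault.
 */
static long finish_propagated(long rc, long read_done)
{
	if (rc < 0 && !read_done)
		return rc;
	return read_done;
}
```

Both variants return the partial byte count when some data was already read; they differ only in the code reported for a failure on the first chunk.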
> -	if (bytes > 0) {
> -		sock_hold(&smc->sk);
> -		if (!lgr->is_smcd && smc->conn.rmb_desc->is_vm) {
> -			for (i = 0; i < PAGE_ALIGN(bytes + offset) / PAGE_SIZE; i++)
> -				get_page(pages[i]);
> -		} else {
> -			get_page(smc->conn.rmb_desc->pages);
> -		}
> +	if (bytes > 0)
>  		atomic_add(bytes, &smc->conn.splice_pending);
> -	}
>  	kfree(priv);
>  	kfree(partial);
>  	kfree(pages);
--
Sashiko AI review · https://sashiko.dev/#/patchset/192d1b44ed358ca143f44ef167d14153bccc51e9.1781097957.git.d4n.for.sec@gmail.com?part=1
* Re: [PATCH net 1/1] net: smc: fix splice entry lifetime imbalance in smc_rx_splice
2026-06-10 17:54 ` [PATCH net 1/1] net: smc: fix splice entry lifetime imbalance in smc_rx_splice Ren Wei
2026-06-11 17:54 ` sashiko-bot
@ 2026-06-15 19:49 ` Jakub Kicinski
2026-06-16 9:30 ` Dust Li
2026-06-16 14:27 ` Sidraya Jayagond
3 siblings, 0 replies; 5+ messages in thread
From: Jakub Kicinski @ 2026-06-15 19:49 UTC (permalink / raw)
To: Ren Wei
Cc: linux-rdma, linux-s390, netdev, alibuda, dust.li, sidraya, wenjia,
mjambigi, tonylu, guwen, ubraun, stefan.raspl, davem, yuantan098,
zcliangcn, bird, lx24, d4n.for.sec
On Thu, 11 Jun 2026 01:54:11 +0800 Ren Wei wrote:
> smc_rx_splice() hands candidate pages to splice_to_pipe() without taking
> references for the lifetime of each splice entry first. That breaks the
> splice ownership contract in the VM-backed RMB path.
>
> splice_to_pipe() drops unqueued entries through spd_release(), while
> queued entries are later dropped through the pipe buffer release
> callback. The current code only tries to take page references after the
> splice succeeds, and it derives the number of queued VM pages from a
> mutated offset value. This can underflow page refcounts and trigger a
> use-after-free. It also leaves the socket lifetime imbalanced in the
> multi-page VM case, where one sock_hold() can be followed by multiple
> sock_put() calls.
>
> Fix this by taking the page and socket references for every candidate
> splice entry before calling splice_to_pipe(), and by releasing the
> matching private state, page reference, and socket reference from
> smc_rx_spd_release() for entries that never get queued. This makes the
> SMC splice path follow the normal splice lifetime rules and removes the
> broken post-splice VM page counting entirely.
SMC maintainers please review.
* Re: [PATCH net 1/1] net: smc: fix splice entry lifetime imbalance in smc_rx_splice
2026-06-10 17:54 ` [PATCH net 1/1] net: smc: fix splice entry lifetime imbalance in smc_rx_splice Ren Wei
2026-06-11 17:54 ` sashiko-bot
2026-06-15 19:49 ` Jakub Kicinski
@ 2026-06-16 9:30 ` Dust Li
2026-06-16 14:27 ` Sidraya Jayagond
3 siblings, 0 replies; 5+ messages in thread
From: Dust Li @ 2026-06-16 9:30 UTC (permalink / raw)
To: Ren Wei, linux-rdma, linux-s390, netdev
Cc: alibuda, sidraya, wenjia, mjambigi, tonylu, guwen, ubraun,
stefan.raspl, davem, yuantan098, zcliangcn, bird, lx24,
d4n.for.sec
On 2026-06-11 01:54:11, Ren Wei wrote:
>From: Daming Li <d4n.for.sec@gmail.com>
>
>smc_rx_splice() hands candidate pages to splice_to_pipe() without taking
>references for the lifetime of each splice entry first. That breaks the
>splice ownership contract in the VM-backed RMB path.
>
>splice_to_pipe() drops unqueued entries through spd_release(), while
>queued entries are later dropped through the pipe buffer release
>callback. The current code only tries to take page references after the
>splice succeeds, and it derives the number of queued VM pages from a
>mutated offset value. This can underflow page refcounts and trigger a
>use-after-free. It also leaves the socket lifetime imbalanced in the
>multi-page VM case, where one sock_hold() can be followed by multiple
>sock_put() calls.
>
>Fix this by taking the page and socket references for every candidate
>splice entry before calling splice_to_pipe(), and by releasing the
>matching private state, page reference, and socket reference from
>smc_rx_spd_release() for entries that never get queued. This makes the
>SMC splice path follow the normal splice lifetime rules and removes the
>broken post-splice VM page counting entirely.
>
>Fixes: 9014db202cb7 ("smc: add support for splice()")
>Cc: stable@vger.kernel.org
>Reported-by: Yuan Tan <yuantan098@gmail.com>
>Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
>Reported-by: Xin Liu <bird@lzu.edu.cn>
>Assisted-by: Codex:GPT-5.4
>Co-developed-by: Liu Xiao <lx24@stu.ynu.edu.cn>
>Signed-off-by: Liu Xiao <lx24@stu.ynu.edu.cn>
>Signed-off-by: Daming Li <d4n.for.sec@gmail.com>
>Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
The patch looks good to me; one minor nit below.
Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
>---
> net/smc/smc_rx.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
>
>diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
>index c1d9b923938d..88aee0d93597 100644
>--- a/net/smc/smc_rx.c
>+++ b/net/smc/smc_rx.c
>@@ -150,18 +150,23 @@ static const struct pipe_buf_operations smc_pipe_ops = {
> static void smc_rx_spd_release(struct splice_pipe_desc *spd,
> 			       unsigned int i)
> {
>+	struct smc_spd_priv *priv = (struct smc_spd_priv *)spd->partial[i].private;
>+	struct sock *sk = &priv->smc->sk;
>+
>+	kfree(priv);
> 	put_page(spd->pages[i]);
>+	sock_put(sk);
> }
>
> static int smc_rx_splice(struct pipe_inode_info *pipe, char *src, size_t len,
> 			 struct smc_sock *smc)
> {
> 	struct smc_link_group *lgr = smc->conn.lgr;
>-	int offset = offset_in_page(src);
> 	struct partial_page *partial;
> 	struct splice_pipe_desc spd;
> 	struct smc_spd_priv **priv;
> 	struct page **pages;
>+	int offset = offset_in_page(src);
Minor nit:
Moving "int offset = offset_in_page(src);" down breaks the existing
reverse-xmas-tree declaration ordering (longest declaration line first).
We keep this style in SMC.
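For readers unfamiliar with the convention, here is a minimal sketch that checks the ordering on the declarations from the hunk above. is_reverse_xmas_tree() and the two arrays are hypothetical helpers written for this illustration; they are not part of the kernel or of checkpatch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if every declaration line is no longer than the one
 * before it (reverse xmas tree), 0 otherwise.
 */
static int is_reverse_xmas_tree(const char *const *decls, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (strlen(decls[i]) > strlen(decls[i - 1]))
			return 0;
	return 1;
}

/* Declaration order before the patch. */
static const char *const decls_before[] = {
	"struct smc_link_group *lgr = smc->conn.lgr;",
	"int offset = offset_in_page(src);",
	"struct partial_page *partial;",
	"struct splice_pipe_desc spd;",
	"struct smc_spd_priv **priv;",
	"struct page **pages;",
	"int bytes, nr_pages;",
	"int i;",
};

/* Declaration order after the patch: "offset" now follows "pages". */
static const char *const decls_after[] = {
	"struct smc_link_group *lgr = smc->conn.lgr;",
	"struct partial_page *partial;",
	"struct splice_pipe_desc spd;",
	"struct smc_spd_priv **priv;",
	"struct page **pages;",
	"int offset = offset_in_page(src);",
	"int bytes, nr_pages;",
	"int i;",
};
```

The check passes for the original order and fails for the patched one, because the 33-character "offset" line ends up below the 20-character "pages" line.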
Best regards,
Dust
* Re: [PATCH net 1/1] net: smc: fix splice entry lifetime imbalance in smc_rx_splice
2026-06-10 17:54 ` [PATCH net 1/1] net: smc: fix splice entry lifetime imbalance in smc_rx_splice Ren Wei
` (2 preceding siblings ...)
2026-06-16 9:30 ` Dust Li
@ 2026-06-16 14:27 ` Sidraya Jayagond
3 siblings, 0 replies; 5+ messages in thread
From: Sidraya Jayagond @ 2026-06-16 14:27 UTC (permalink / raw)
To: Ren Wei, linux-rdma, linux-s390, netdev
Cc: alibuda, dust.li, wenjia, mjambigi, tonylu, guwen, ubraun,
stefan.raspl, davem, yuantan098, zcliangcn, bird, lx24,
d4n.for.sec
On 10/06/26 11:24 pm, Ren Wei wrote:
> From: Daming Li <d4n.for.sec@gmail.com>
>
> smc_rx_splice() hands candidate pages to splice_to_pipe() without taking
> references for the lifetime of each splice entry first. That breaks the
> splice ownership contract in the VM-backed RMB path.
>
> splice_to_pipe() drops unqueued entries through spd_release(), while
> queued entries are later dropped through the pipe buffer release
> callback. The current code only tries to take page references after the
> splice succeeds, and it derives the number of queued VM pages from a
> mutated offset value. This can underflow page refcounts and trigger a
> use-after-free. It also leaves the socket lifetime imbalanced in the
> multi-page VM case, where one sock_hold() can be followed by multiple
> sock_put() calls.
>
> Fix this by taking the page and socket references for every candidate
> splice entry before calling splice_to_pipe(), and by releasing the
> matching private state, page reference, and socket reference from
> smc_rx_spd_release() for entries that never get queued. This makes the
> SMC splice path follow the normal splice lifetime rules and removes the
> broken post-splice VM page counting entirely.
>
> Fixes: 9014db202cb7 ("smc: add support for splice()")
> Cc: stable@vger.kernel.org
> Reported-by: Yuan Tan <yuantan098@gmail.com>
> Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
> Reported-by: Xin Liu <bird@lzu.edu.cn>
> Assisted-by: Codex:GPT-5.4
> Co-developed-by: Liu Xiao <lx24@stu.ynu.edu.cn>
> Signed-off-by: Liu Xiao <lx24@stu.ynu.edu.cn>
> Signed-off-by: Daming Li <d4n.for.sec@gmail.com>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
> ---
> net/smc/smc_rx.c | 21 +++++++++++----------
> 1 file changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/net/smc/smc_rx.c b/net/smc/smc_rx.c
> index c1d9b923938d..88aee0d93597 100644
> --- a/net/smc/smc_rx.c
> +++ b/net/smc/smc_rx.c
> @@ -150,18 +150,23 @@ static const struct pipe_buf_operations smc_pipe_ops = {
>  static void smc_rx_spd_release(struct splice_pipe_desc *spd,
>  			       unsigned int i)
>  {
> +	struct smc_spd_priv *priv = (struct smc_spd_priv *)spd->partial[i].private;
> +	struct sock *sk = &priv->smc->sk;
> +
> +	kfree(priv);
>  	put_page(spd->pages[i]);
> +	sock_put(sk);
>  }
>
>  static int smc_rx_splice(struct pipe_inode_info *pipe, char *src, size_t len,
>  			 struct smc_sock *smc)
>  {
>  	struct smc_link_group *lgr = smc->conn.lgr;
> -	int offset = offset_in_page(src);
>  	struct partial_page *partial;
>  	struct splice_pipe_desc spd;
>  	struct smc_spd_priv **priv;
>  	struct page **pages;
> +	int offset = offset_in_page(src);
>  	int bytes, nr_pages;
>  	int i;
>
> @@ -209,6 +214,10 @@ static int smc_rx_splice(struct pipe_inode_info *pipe, char *src, size_t len,
>  			offset = 0;
>  		}
>  	}
> +	for (i = 0; i < nr_pages; i++) {
> +		get_page(pages[i]);
> +		sock_hold(&smc->sk);
> +	}
>  	spd.nr_pages_max = nr_pages;
>  	spd.nr_pages = nr_pages;
>  	spd.pages = pages;
> @@ -217,16 +226,8 @@ static int smc_rx_splice(struct pipe_inode_info *pipe, char *src, size_t len,
>  	spd.spd_release = smc_rx_spd_release;
>
>  	bytes = splice_to_pipe(pipe, &spd);
> -	if (bytes > 0) {
> -		sock_hold(&smc->sk);
> -		if (!lgr->is_smcd && smc->conn.rmb_desc->is_vm) {
> -			for (i = 0; i < PAGE_ALIGN(bytes + offset) / PAGE_SIZE; i++)
> -				get_page(pages[i]);
> -		} else {
> -			get_page(smc->conn.rmb_desc->pages);
> -		}
> +	if (bytes > 0)
>  		atomic_add(bytes, &smc->conn.splice_pending);
> -	}
>  	kfree(priv);
>  	kfree(partial);
>  	kfree(pages);
Code changes look good to me.
Reviewed-by: Sidraya Jayagond <sidraya@linux.ibm.com>