* [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
@ 2025-07-23 10:41 Pedro Falcato
2025-07-23 14:52 ` Bernard Metzler
From: Pedro Falcato @ 2025-07-23 10:41 UTC (permalink / raw)
To: Jason Gunthorpe, Bernard Metzler, Leon Romanovsky,
Vlastimil Babka
Cc: Jakub Kicinski, David Howells, Tom Talpey, linux-rdma,
linux-kernel, linux-mm, Pedro Falcato, stable, kernel test robot
Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"),
we have been doing this:
static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
                             size_t size)
[...]
        /* Calculate the number of bytes we need to push, for this page
         * specifically */
        size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
        /* If we can't splice it, then copy it in, as normal */
        if (!sendpage_ok(page[i]))
                msg.msg_flags &= ~MSG_SPLICE_PAGES;
        /* Set the bvec pointing to the page, with len $bytes */
        bvec_set_page(&bvec, page[i], bytes, offset);
        /* Set the iter to $size, aka the size of the whole sendpages (!!!) */
        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
try_page_again:
        lock_sock(sk);
        /* Sendmsg with $size size (!!!) */
        rv = tcp_sendmsg_locked(sk, &msg, size);
This means we've been building oversized iov_iters and issuing oversized
tcp_sendmsg calls for a while. This has been a benign bug because
sendpage_ok() always returned true. With the recent slab allocator changes
being slowly introduced into next (which disallow sendpage on large kmalloc
allocations), we have started hitting out-of-bounds crashes, due to slight
differences in iov_iter behavior between the MSG_SPLICE_PAGES and
"regular" copy paths:
(MSG_SPLICE_PAGES)
  skb_splice_from_iter
    iov_iter_extract_pages
      iov_iter_extract_bvec_pages
        uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
  skb_splice_from_iter gets a "short" read

(!MSG_SPLICE_PAGES)
  skb_copy_to_page_nocache copy=iov_iter_count
    [...]
      copy_from_iter
        /* this doesn't help */
        if (unlikely(iter->count < len))
                len = iter->count;
        iterate_bvec
          ... and we run off the bvecs
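To make the mismatch concrete, here is a hypothetical walk-through (the
numbers are illustrative, not taken from the report):

        /* Suppose siw_tcp_sendpages() is asked to push 3 pages:
         *   size = 12288 (3 * PAGE_SIZE), offset = 0
         *
         * First loop iteration (i = 0):
         *   bytes = min_t(size_t, PAGE_SIZE - 0, 12288) = 4096
         *   bvec_set_page(&bvec, page[0], 4096, 0);  // bvec backs 4096 bytes
         *   iov_iter_bvec(..., &bvec, 1, 12288);     // iter claims 12288 (!)
         *   tcp_sendmsg_locked(sk, &msg, 12288);     // asks for 12288 (!)
         *
         * On the !MSG_SPLICE_PAGES path the copy loop trusts the iterator's
         * 12288-byte count, but only one 4096-byte bvec backs it, so
         * iterate_bvec() runs 8192 bytes past the end of the bvec.
         */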
Fix this by properly setting the iov_iter's byte count, plus sending the
correct byte count to tcp_sendmsg_locked.
Cc: stable@vger.kernel.org
Fixes: c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
---
 drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 3a08f57d2211..9576a2b766c4 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
 		if (!sendpage_ok(page[i]))
 			msg.msg_flags &= ~MSG_SPLICE_PAGES;
 		bvec_set_page(&bvec, page[i], bytes, offset);
-		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
 
 try_page_again:
 		lock_sock(sk);
-		rv = tcp_sendmsg_locked(sk, &msg, size);
+		rv = tcp_sendmsg_locked(sk, &msg, bytes);
 		release_sock(sk);
 
 		if (rv > 0) {
--
2.50.1
* RE: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
2025-07-23 10:41 [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages Pedro Falcato
@ 2025-07-23 14:52 ` Bernard Metzler
2025-07-23 15:49 ` Pedro Falcato
From: Bernard Metzler @ 2025-07-23 14:52 UTC (permalink / raw)
To: Pedro Falcato, Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka
Cc: Jakub Kicinski, David Howells, Tom Talpey,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, stable@vger.kernel.org, kernel test robot
> -----Original Message-----
> From: Pedro Falcato <pfalcato@suse.de>
> Sent: Wednesday, 23 July 2025 12:41
> To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler <BMT@zurich.ibm.com>;
> Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> Cc: Jakub Kicinski <kuba@kernel.org>; David Howells <dhowells@redhat.com>;
> Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> <oliver.sang@intel.com>
> Subject: [EXTERNAL] [PATCH] RDMA/siw: Fix the sendmsg byte count in
> siw_tcp_sendpages
>
> Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"),
> we have been doing this:
>
> static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
>                              size_t size)
> [...]
>         /* Calculate the number of bytes we need to push, for this page
>          * specifically */
>         size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
>         /* If we can't splice it, then copy it in, as normal */
>         if (!sendpage_ok(page[i]))
>                 msg.msg_flags &= ~MSG_SPLICE_PAGES;
>         /* Set the bvec pointing to the page, with len $bytes */
>         bvec_set_page(&bvec, page[i], bytes, offset);
>         /* Set the iter to $size, aka the size of the whole sendpages (!!!) */
>         iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> try_page_again:
>         lock_sock(sk);
>         /* Sendmsg with $size size (!!!) */
>         rv = tcp_sendmsg_locked(sk, &msg, size);
>
> This means we've been building oversized iov_iters and issuing oversized
> tcp_sendmsg calls for a while. This has been a benign bug because
> sendpage_ok() always returned true. With the recent slab allocator changes
> being slowly introduced into next (which disallow sendpage on large kmalloc
> allocations), we have started hitting out-of-bounds crashes, due to slight
> differences in iov_iter behavior between the MSG_SPLICE_PAGES and
> "regular" copy paths:
>
> (MSG_SPLICE_PAGES)
>   skb_splice_from_iter
>     iov_iter_extract_pages
>       iov_iter_extract_bvec_pages
>         uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
>   skb_splice_from_iter gets a "short" read
>
> (!MSG_SPLICE_PAGES)
>   skb_copy_to_page_nocache copy=iov_iter_count
>     [...]
>       copy_from_iter
>         /* this doesn't help */
>         if (unlikely(iter->count < len))
>                 len = iter->count;
>         iterate_bvec
>           ... and we run off the bvecs
>
> Fix this by properly setting the iov_iter's byte count, plus sending the
> correct byte count to tcp_sendmsg_locked.
>
> Cc: stable@vger.kernel.org
> Fixes: c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
> Signed-off-by: Pedro Falcato <pfalcato@suse.de>
> ---
> drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 3a08f57d2211..9576a2b766c4 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
>  		if (!sendpage_ok(page[i]))
>  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
>  		bvec_set_page(&bvec, page[i], bytes, offset);
> -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> 
>  try_page_again:
>  		lock_sock(sk);
> -		rv = tcp_sendmsg_locked(sk, &msg, size);
> +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
>  		release_sock(sk);
> 
Pedro, many thanks for catching this! I completely missed it during my
overly sloppy review of that patch. It's a serious bug which must be
fixed ASAP.

BUT, looking closer, I do not see the offset being taken into account
when retrying a current segment. So, resend attempts seem to send old
data that have already gone out. Shouldn't the try_page_again: label be
above bvec_set_page()?
Thanks!
Bernard.
> if (rv > 0) {
> --
> 2.50.1
* Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
2025-07-23 14:52 ` Bernard Metzler
@ 2025-07-23 15:49 ` Pedro Falcato
2025-07-23 16:49 ` Bernard Metzler
From: Pedro Falcato @ 2025-07-23 15:49 UTC (permalink / raw)
To: Bernard Metzler
Cc: Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka, Jakub Kicinski,
David Howells, Tom Talpey, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
stable@vger.kernel.org, kernel test robot
On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
>
>
> > -----Original Message-----
> > From: Pedro Falcato <pfalcato@suse.de>
> > Sent: Wednesday, 23 July 2025 12:41
> > To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler <BMT@zurich.ibm.com>;
> > Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> > Cc: Jakub Kicinski <kuba@kernel.org>; David Howells <dhowells@redhat.com>;
> > Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> > kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> > <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> > <oliver.sang@intel.com>
> [snip]
> > ---
> > drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > index 3a08f57d2211..9576a2b766c4 100644
> > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
> >  		if (!sendpage_ok(page[i]))
> >  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> >  		bvec_set_page(&bvec, page[i], bytes, offset);
> > -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > 
> >  try_page_again:
> >  		lock_sock(sk);
> > -		rv = tcp_sendmsg_locked(sk, &msg, size);
> > +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
> >  		release_sock(sk);
> > 
>
> Pedro, many thanks for catching this! I completely
> missed it during my too sloppy review of that patch.
> It's a serious bug which must be fixed asap.
> BUT, looking closer, I do not see the offset being taken
> into account when retrying a current segment. So,
> resend attempts seem to send old data which are already
> out. Shouldn't the try_page_again: label be above
> bvec_set_page()??
This was raised off-list by Vlastimil - I think it's harmless to bump
(but not use) the offset here, because by reusing the iov_iter we
progressively consume the data (it keeps its own size and offset
tracking internally). So the only thing we need to track is the size we
pass to tcp_sendmsg_locked[1].

If desired (and if my logic is correct!) I can send a v2 deleting that
bit.

[1] Assuming tcp_sendmsg_locked guarantees it will never consume
something out of the iov_iter without reporting it as bytes copied,
which from a code reading it seems it won't...
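For illustration, a minimal sketch of the retry structure this reasoning
implies (assuming, per [1], that tcp_sendmsg_locked() advances
msg.msg_iter by exactly the bytes it reports as sent; untested):

        bvec_set_page(&bvec, page[i], bytes, offset);
        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);

try_page_again:
        lock_sock(sk);
        /* On a retry, msg.msg_iter already remembers how far into the
         * bvec the previous call got; only the residual byte count we
         * hand to tcp_sendmsg_locked() still has to shrink. */
        rv = tcp_sendmsg_locked(sk, &msg, bytes);
        release_sock(sk);

        if (rv > 0 && rv < bytes) {
                bytes -= rv;
                offset += rv;   /* bumped for bookkeeping, but unused */
                goto try_page_again;
        }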
--
Pedro
* RE: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
2025-07-23 15:49 ` Pedro Falcato
@ 2025-07-23 16:49 ` Bernard Metzler
2025-07-25 16:09 ` Pedro Falcato
From: Bernard Metzler @ 2025-07-23 16:49 UTC (permalink / raw)
To: Pedro Falcato
Cc: Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka, Jakub Kicinski,
David Howells, Tom Talpey, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
stable@vger.kernel.org, kernel test robot
> -----Original Message-----
> From: Pedro Falcato <pfalcato@suse.de>
> Sent: Wednesday, 23 July 2025 17:49
> To: Bernard Metzler <BMT@zurich.ibm.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> Vlastimil Babka <vbabka@suse.cz>; Jakub Kicinski <kuba@kernel.org>; David
> Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>; linux-
> rdma@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> stable@vger.kernel.org; kernel test robot <oliver.sang@intel.com>
> Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> siw_tcp_sendpages
>
> On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> >
> >
> > > -----Original Message-----
> > > From: Pedro Falcato <pfalcato@suse.de>
> > > Sent: Wednesday, 23 July 2025 12:41
> > > To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler
> <BMT@zurich.ibm.com>;
> > > Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> > > Cc: Jakub Kicinski <kuba@kernel.org>; David Howells
> <dhowells@redhat.com>;
> > > Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> > > kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> > > <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> > > <oliver.sang@intel.com>
> > [snip]
> > > ---
> > > drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > index 3a08f57d2211..9576a2b766c4 100644
> > > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
> > >  		if (!sendpage_ok(page[i]))
> > >  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> > >  		bvec_set_page(&bvec, page[i], bytes, offset);
> > > -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > > +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > > 
> > >  try_page_again:
> > >  		lock_sock(sk);
> > > -		rv = tcp_sendmsg_locked(sk, &msg, size);
> > > +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
> > >  		release_sock(sk);
> > > 
> >
> > Pedro, many thanks for catching this! I completely
> > missed it during my too sloppy review of that patch.
> > It's a serious bug which must be fixed asap.
> > BUT, looking closer, I do not see the offset being taken
> > into account when retrying a current segment. So,
> > resend attempts seem to send old data which are already
> > out. Shouldn't the try_page_again: label be above
> > bvec_set_page()??
>
> This was raised off-list by Vlastimil - I think it's harmless to bump
> (but not use) the offset here, because by reusing the iov_iter we
> progressively consume the data (it keeps its own size and offset
> tracking internally). So the only thing we need to track is the size
> we pass to tcp_sendmsg_locked[1].
>
Ah okay, I didn't know that. Are we sure? I am currently travelling and
have only limited possibilities to try out things. I just looked up
other use cases and found one in net/tls/tls_main.c#L197. Here the loop
looks very similar, but it works as I was suggesting (taking the offset
into account and re-initializing a new bvec in case of a partial send).
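For reference, the tls loop I mean is shaped roughly like this (a
paraphrase from memory, not verbatim net/tls code):

        /* Rebuild the bvec from the updated offset on every (re)try,
         * rather than relying on the iterator's internal bookkeeping. */
retry:
        bvec_set_page(&bvec, page, bytes, offset);
        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
        rv = tcp_sendmsg_locked(sk, &msg, bytes);
        if (rv > 0 && rv < bytes) {
                offset += rv;
                bytes -= rv;
                goto retry;
        }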
> If desired (and if my logic is correct!) I can send a v2 deleting that bit.
>
So yes, if that's all safe, please do. We shall not have dead code.
Thanks!
Bernard.
>
> [1] Assuming tcp_sendmsg_locked guarantees it will never consume
> something out of the iov_iter without reporting it as bytes copied,
> which from a code reading it seems it won't...
>
>
> --
> Pedro
* Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
2025-07-23 16:49 ` Bernard Metzler
@ 2025-07-25 16:09 ` Pedro Falcato
2025-07-28 11:34 ` Bernard Metzler
From: Pedro Falcato @ 2025-07-25 16:09 UTC (permalink / raw)
To: Bernard Metzler
Cc: Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka, Jakub Kicinski,
David Howells, Tom Talpey, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
stable@vger.kernel.org, kernel test robot
On Wed, Jul 23, 2025 at 04:49:30PM +0000, Bernard Metzler wrote:
>
>
> > -----Original Message-----
> > From: Pedro Falcato <pfalcato@suse.de>
> > Sent: Wednesday, 23 July 2025 17:49
> > To: Bernard Metzler <BMT@zurich.ibm.com>
> > Cc: Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> > Vlastimil Babka <vbabka@suse.cz>; Jakub Kicinski <kuba@kernel.org>; David
> > Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>; linux-
> > rdma@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > stable@vger.kernel.org; kernel test robot <oliver.sang@intel.com>
> > Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> > siw_tcp_sendpages
> >
> > On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Pedro Falcato <pfalcato@suse.de>
> > > > Sent: Wednesday, 23 July 2025 12:41
> > > > To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler
> > <BMT@zurich.ibm.com>;
> > > > Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> > > > Cc: Jakub Kicinski <kuba@kernel.org>; David Howells
> > <dhowells@redhat.com>;
> > > > Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> > > > kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> > > > <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> > > > <oliver.sang@intel.com>
> > > [snip]
> > > > ---
> > > > drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > index 3a08f57d2211..9576a2b766c4 100644
> > > > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
> > > >  		if (!sendpage_ok(page[i]))
> > > >  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> > > >  		bvec_set_page(&bvec, page[i], bytes, offset);
> > > > -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > > > +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > > > 
> > > >  try_page_again:
> > > >  		lock_sock(sk);
> > > > -		rv = tcp_sendmsg_locked(sk, &msg, size);
> > > > +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
> > > >  		release_sock(sk);
> > > > 
> > >
> > > Pedro, many thanks for catching this! I completely
> > > missed it during my too sloppy review of that patch.
> > > It's a serious bug which must be fixed asap.
> > > BUT, looking closer, I do not see the offset being taken
> > > into account when retrying a current segment. So,
> > > resend attempts seem to send old data which are already
> > > out. Shouldn't the try_page_again: label be above
> > > bvec_set_page()??
> >
> > This was raised off-list by Vlastimil - I think it's harmless to
> > bump (but not use) the offset here, because by reusing the iov_iter
> > we progressively consume the data (it keeps its own size and offset
> > tracking internally). So the only thing we need to track is the
> > size we pass to tcp_sendmsg_locked[1].
> >
Hi,
Sorry for the delay.
> Ah okay, I didn't know that. Are we sure? I am currently travelling and have
> only limited possibilities to try out things. I just looked up other
I'm not 100% sure, and if some more authoritative voice (David, or Jakub for
the net side) could confirm my analysis, it would be great.
> use cases and found one in net/tls/tls_main.c#L197. Here the loop looks
> very similar, but it works as I was suggesting (taking offset into account
> and re-initializing new bvec in case of partial send).
>
> > If desired (and if my logic is correct!) I can send a v2 deleting that bit.
> >
>
> So yes, if that's all safe, please do. We shall not have dead code.
Understood. I'll send a v2 resetting the bvec and iov_iter if we get no
further feedback in the meantime.
>
> Thanks!
> Bernard.
> >
> > [1] Assuming tcp_sendmsg_locked guarantees it will never consume
> > something out of the iov_iter without reporting it as bytes copied,
> > which from a code reading it seems it won't...
> >
> >
> > --
> > Pedro
--
Pedro
* RE: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
2025-07-25 16:09 ` Pedro Falcato
@ 2025-07-28 11:34 ` Bernard Metzler
From: Bernard Metzler @ 2025-07-28 11:34 UTC (permalink / raw)
To: Pedro Falcato, bernard.metzler@linux.dev
Cc: Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka, Jakub Kicinski,
David Howells, Tom Talpey, linux-rdma@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
stable@vger.kernel.org, kernel test robot
> -----Original Message-----
> From: Pedro Falcato <pfalcato@suse.de>
> Sent: Friday, 25 July 2025 18:10
> To: Bernard Metzler <BMT@zurich.ibm.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> Vlastimil Babka <vbabka@suse.cz>; Jakub Kicinski <kuba@kernel.org>; David
> Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>; linux-
> rdma@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> stable@vger.kernel.org; kernel test robot <oliver.sang@intel.com>
> Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> siw_tcp_sendpages
>
> On Wed, Jul 23, 2025 at 04:49:30PM +0000, Bernard Metzler wrote:
> >
> >
> > > -----Original Message-----
> > > From: Pedro Falcato <pfalcato@suse.de>
> > > Sent: Wednesday, 23 July 2025 17:49
> > > To: Bernard Metzler <BMT@zurich.ibm.com>
> > > Cc: Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> > > Vlastimil Babka <vbabka@suse.cz>; Jakub Kicinski <kuba@kernel.org>;
> David
> > > Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>; linux-
> > > rdma@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > > stable@vger.kernel.org; kernel test robot <oliver.sang@intel.com>
> > > Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> > > siw_tcp_sendpages
> > >
> > > On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Pedro Falcato <pfalcato@suse.de>
> > > > > Sent: Wednesday, 23 July 2025 12:41
> > > > > To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler
> > > <BMT@zurich.ibm.com>;
> > > > > Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> > > > > Cc: Jakub Kicinski <kuba@kernel.org>; David Howells
> > > <dhowells@redhat.com>;
> > > > > Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> > > > > kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> > > > > <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> > > > > <oliver.sang@intel.com>
> > > > [snip]
> > > > > ---
> > > > > drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > index 3a08f57d2211..9576a2b766c4 100644
> > > > > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
> > > > >  		if (!sendpage_ok(page[i]))
> > > > >  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> > > > >  		bvec_set_page(&bvec, page[i], bytes, offset);
> > > > > -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > > > > +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > > > > 
> > > > >  try_page_again:
> > > > >  		lock_sock(sk);
> > > > > -		rv = tcp_sendmsg_locked(sk, &msg, size);
> > > > > +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
> > > > >  		release_sock(sk);
> > > > > 
> > > >
> > > > Pedro, many thanks for catching this! I completely
> > > > missed it during my too sloppy review of that patch.
> > > > It's a serious bug which must be fixed asap.
> > > > BUT, looking closer, I do not see the offset being taken
> > > > into account when retrying a current segment. So,
> > > > resend attempts seem to send old data which are already
> > > > out. Shouldn't the try_page_again: label be above
> > > > bvec_set_page()??
> > >
> > > This was raised off-list by Vlastimil - I think it's harmless to
> > > bump (but not use) the offset here, because by reusing the
> > > iov_iter we progressively consume the data (it keeps its own size
> > > and offset tracking internally). So the only thing we need to
> > > track is the size we pass to tcp_sendmsg_locked[1].
> > >
>
> Hi,
>
> Sorry for the delay.
>
> > Ah okay, I didn't know that. Are we sure? I am currently travelling
> > and have only limited possibilities to try out things. I just
> > looked up other
>
> I'm not 100% sure, and if some more authoritative voice (David, or
> Jakub for the net side) could confirm my analysis, it would be great.
>
> > use cases and found one in net/tls/tls_main.c#L197. Here the loop
> > looks very similar, but it works as I was suggesting (taking the
> > offset into account and re-initializing a new bvec in case of a
> > partial send).
> >
> > > If desired (and if my logic is correct!) I can send a v2 deleting
> > > that bit.
> >
> > So yes, if that's all safe, please do. We shall not have dead code.
>
> Understood. I'll send a v2 resetting the bvec and iov_iter if we get
> no further feedback in the meantime.
Let's see if we get input on it. Even if that is all safe, I think we
shall put in a short comment explaining why we do not re-initialize the
bvec.
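Something along these lines, perhaps (the wording is only a suggestion):

        /* Do not rebuild the bvec on a partial send: tcp_sendmsg_locked()
         * has already advanced msg.msg_iter past the bytes it reported as
         * sent, so only the residual length passed on retry shrinks. */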
Please include my new maintainer email address in further iterations,
and stop sending to the old one. My IBM address goes away at the end of
the month.
Thank you!
Bernard.
>
> >
> > Thanks!
> > Bernard.
> > >
> > > [1] Assuming tcp_sendmsg_locked guarantees it will never consume
> something
> > > out
> > > of the iovec_iter without reporting it as bytes copied, which from a
> code
> > > reading
> > > it seems like it won't...
> > >
> > >
> > > --
> > > Pedro
>
> --
> Pedro