* [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
@ 2025-07-23 10:41 Pedro Falcato
  2025-07-23 14:52 ` Bernard Metzler
  0 siblings, 1 reply; 6+ messages in thread
From: Pedro Falcato @ 2025-07-23 10:41 UTC (permalink / raw)
  To: Jason Gunthorpe, Bernard Metzler, Leon Romanovsky,
	Vlastimil Babka
  Cc: Jakub Kicinski, David Howells, Tom Talpey, linux-rdma,
	linux-kernel, linux-mm, Pedro Falcato, stable, kernel test robot

Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"),
we have been doing this:

static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
                             size_t size)
[...]
        /* Calculate the number of bytes we need to push, for this page
         * specifically */
        size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
        /* If we can't splice it, then copy it in, as normal */
        if (!sendpage_ok(page[i]))
                msg.msg_flags &= ~MSG_SPLICE_PAGES;
        /* Set the bvec pointing to the page, with len $bytes */
        bvec_set_page(&bvec, page[i], bytes, offset);
        /* Set the iter to $size, aka the size of the whole sendpages (!!!) */
        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
try_page_again:
        lock_sock(sk);
        /* Sendmsg with $size size (!!!) */
        rv = tcp_sendmsg_locked(sk, &msg, size);

This means we've been sending oversized iov_iters and tcp_sendmsg calls
for a while. This has been a benign bug because sendpage_ok() always
returned true. With the recent slab allocator changes being slowly
introduced into next (that disallow sendpage on large kmalloc
allocations), we have recently hit out-of-bounds crashes, due to slight
differences in iov_iter behavior between the MSG_SPLICE_PAGES and
"regular" copy paths:

(MSG_SPLICE_PAGES)
skb_splice_from_iter
  iov_iter_extract_pages
    iov_iter_extract_bvec_pages
      uses i->nr_segs to correctly stop in its tracks before OoB'ing everywhere
  skb_splice_from_iter gets a "short" read

(!MSG_SPLICE_PAGES)
skb_copy_to_page_nocache copy=iov_iter_count
 [...]
   copy_from_iter
        /* this doesn't help */
        if (unlikely(iter->count < len))
                len = iter->count;
          iterate_bvec
            ... and we run off the bvecs
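
In short, the invariant being violated (a sketch for illustration, not code
from the tree; `iter` here stands in for msg.msg_iter): an iov_iter's byte
count must not exceed the total payload of the bvec segments backing it.

	struct bio_vec bvec;
	struct iov_iter iter;

	bvec_set_page(&bvec, page[i], bytes, offset);	/* one segment, `bytes` long */

	/* OK: count matches the segment payload */
	iov_iter_bvec(&iter, ITER_SOURCE, &bvec, 1, bytes);

	/* Broken: count claims `size` bytes, but only `bytes` are backed by
	 * the single bvec; the copy path's iterate_bvec() trusts the count
	 * and can walk off the array. */
	iov_iter_bvec(&iter, ITER_SOURCE, &bvec, 1, size);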

Fix this by properly setting the iov_iter's byte count, plus sending the
correct byte count to tcp_sendmsg_locked.

Cc: stable@vger.kernel.org
Fixes: c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
Signed-off-by: Pedro Falcato <pfalcato@suse.de>
---
 drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 3a08f57d2211..9576a2b766c4 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
 		if (!sendpage_ok(page[i]))
 			msg.msg_flags &= ~MSG_SPLICE_PAGES;
 		bvec_set_page(&bvec, page[i], bytes, offset);
-		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
 
 try_page_again:
 		lock_sock(sk);
-		rv = tcp_sendmsg_locked(sk, &msg, size);
+		rv = tcp_sendmsg_locked(sk, &msg, bytes);
 		release_sock(sk);
 
 		if (rv > 0) {
-- 
2.50.1




* RE:  [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
  2025-07-23 10:41 [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages Pedro Falcato
@ 2025-07-23 14:52 ` Bernard Metzler
  2025-07-23 15:49   ` Pedro Falcato
  0 siblings, 1 reply; 6+ messages in thread
From: Bernard Metzler @ 2025-07-23 14:52 UTC (permalink / raw)
  To: Pedro Falcato, Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka
  Cc: Jakub Kicinski, David Howells, Tom Talpey,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, stable@vger.kernel.org, kernel test robot



> -----Original Message-----
> From: Pedro Falcato <pfalcato@suse.de>
> Sent: Wednesday, 23 July 2025 12:41
> To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler <BMT@zurich.ibm.com>;
> Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> Cc: Jakub Kicinski <kuba@kernel.org>; David Howells <dhowells@redhat.com>;
> Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> <oliver.sang@intel.com>
> Subject: [EXTERNAL] [PATCH] RDMA/siw: Fix the sendmsg byte count in
> siw_tcp_sendpages
> 
> Ever since commit c2ff29e99a76 ("siw: Inline do_tcp_sendpages()"),
> we have been doing this:
> 
> static int siw_tcp_sendpages(struct socket *s, struct page **page, int
> offset,
>                              size_t size)
> [...]
>         /* Calculate the number of bytes we need to push, for this page
>          * specifically */
>         size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
>         /* If we can't splice it, then copy it in, as normal */
>         if (!sendpage_ok(page[i]))
>                 msg.msg_flags &= ~MSG_SPLICE_PAGES;
>         /* Set the bvec pointing to the page, with len $bytes */
>         bvec_set_page(&bvec, page[i], bytes, offset);
>         /* Set the iter to $size, aka the size of the whole sendpages (!!!)
> */
>         iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> try_page_again:
>         lock_sock(sk);
>         /* Sendmsg with $size size (!!!) */
>         rv = tcp_sendmsg_locked(sk, &msg, size);
> 
> This means we've been sending oversized iov_iters and tcp_sendmsg calls
> for a while. This has been a benign bug because sendpage_ok() always
> returned true. With the recent slab allocator changes being slowly
> introduced into next (that disallow sendpage on large kmalloc
> allocations), we have recently hit out-of-bounds crashes, due to slight
> differences in iov_iter behavior between the MSG_SPLICE_PAGES and
> "regular" copy paths:
> 
> (MSG_SPLICE_PAGES)
> skb_splice_from_iter
>   iov_iter_extract_pages
>     iov_iter_extract_bvec_pages
>       uses i->nr_segs to correctly stop in its tracks before OoB'ing
> everywhere
>   skb_splice_from_iter gets a "short" read
> 
> (!MSG_SPLICE_PAGES)
> skb_copy_to_page_nocache copy=iov_iter_count
>  [...]
>    copy_from_iter
>         /* this doesn't help */
>         if (unlikely(iter->count < len))
>                 len = iter->count;
>           iterate_bvec
>             ... and we run off the bvecs
> 
> Fix this by properly setting the iov_iter's byte count, plus sending the
> correct byte count to tcp_sendmsg_locked.
> 
> Cc: stable@vger.kernel.org
> Fixes: c2ff29e99a76 ("siw: Inline do_tcp_sendpages()")
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202507220801.50a7210-lkp@intel.com
> Signed-off-by: Pedro Falcato <pfalcato@suse.de>
> ---
>  drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 3a08f57d2211..9576a2b766c4 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct
> page **page, int offset,
>  		if (!sendpage_ok(page[i]))
>  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
>  		bvec_set_page(&bvec, page[i], bytes, offset);
> -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> 
>  try_page_again:
>  		lock_sock(sk);
> -		rv = tcp_sendmsg_locked(sk, &msg, size);
> +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
>  		release_sock(sk);
> 

Pedro, many thanks for catching this! I completely
missed it during my too-sloppy review of that patch.
It's a serious bug which must be fixed asap.
BUT, looking closer, I do not see the offset being taken
into account when retrying a current segment. So,
resend attempts seem to send old data that has already
gone out. Shouldn't the try_page_again: label be above
bvec_set_page()??

Thanks!
Bernard.



>  		if (rv > 0) {
> --
> 2.50.1




* Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
  2025-07-23 14:52 ` Bernard Metzler
@ 2025-07-23 15:49   ` Pedro Falcato
  2025-07-23 16:49     ` Bernard Metzler
  0 siblings, 1 reply; 6+ messages in thread
From: Pedro Falcato @ 2025-07-23 15:49 UTC (permalink / raw)
  To: Bernard Metzler
  Cc: Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka, Jakub Kicinski,
	David Howells, Tom Talpey, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	stable@vger.kernel.org, kernel test robot

On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> 
> 
> > -----Original Message-----
> > From: Pedro Falcato <pfalcato@suse.de>
> > Sent: Wednesday, 23 July 2025 12:41
> > To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler <BMT@zurich.ibm.com>;
> > Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> > Cc: Jakub Kicinski <kuba@kernel.org>; David Howells <dhowells@redhat.com>;
> > Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> > kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> > <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> > <oliver.sang@intel.com>
> [snip]
> > ---
> >  drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > index 3a08f57d2211..9576a2b766c4 100644
> > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s, struct
> > page **page, int offset,
> >  		if (!sendpage_ok(page[i]))
> >  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> >  		bvec_set_page(&bvec, page[i], bytes, offset);
> > -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > 
> >  try_page_again:
> >  		lock_sock(sk);
> > -		rv = tcp_sendmsg_locked(sk, &msg, size);
> > +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
> >  		release_sock(sk);
> > 
> 
> Pedro, many thanks for catching this! I completely
> missed it during my too-sloppy review of that patch.
> It's a serious bug which must be fixed asap.
> BUT, looking closer, I do not see the offset being taken
> into account when retrying a current segment. So,
> resend attempts seem to send old data that has already
> gone out. Shouldn't the try_page_again: label be above
> bvec_set_page()??

This was raised off-list by Vlastimil - I think it's harmless to bump (but not use)
the offset here, because by reusing the iov_iter we progressively consume the data
(it keeps its own size and offset tracking internally). So the only thing we
need to track is the size we pass to tcp_sendmsg_locked[1].

If desired (and if my logic is correct!) I can send a v2 deleting that bit.


[1] Assuming tcp_sendmsg_locked guarantees it will never consume something out
of the iov_iter without reporting it as bytes copied, which, from a reading of
the code, it seems it won't...
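
To sketch what I mean (illustrative only, not the exact driver loop; it
assumes tcp_sendmsg_locked() only advances msg.msg_iter by the bytes it
reports back):

	/* bvec and msg.msg_iter already set up once for this page, with
	 * `bytes` bytes of payload */
	while (bytes) {
		lock_sock(sk);
		rv = tcp_sendmsg_locked(sk, &msg, bytes); /* iter advances by rv */
		release_sock(sk);
		if (rv <= 0)
			break;
		bytes -= rv;	/* no bvec_set_page()/iov_iter_bvec() re-init needed */
	}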


-- 
Pedro



* RE: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
  2025-07-23 15:49   ` Pedro Falcato
@ 2025-07-23 16:49     ` Bernard Metzler
  2025-07-25 16:09       ` Pedro Falcato
  0 siblings, 1 reply; 6+ messages in thread
From: Bernard Metzler @ 2025-07-23 16:49 UTC (permalink / raw)
  To: Pedro Falcato
  Cc: Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka, Jakub Kicinski,
	David Howells, Tom Talpey, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	stable@vger.kernel.org, kernel test robot



> -----Original Message-----
> From: Pedro Falcato <pfalcato@suse.de>
> Sent: Wednesday, 23 July 2025 17:49
> To: Bernard Metzler <BMT@zurich.ibm.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> Vlastimil Babka <vbabka@suse.cz>; Jakub Kicinski <kuba@kernel.org>; David
> Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>; linux-
> rdma@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> stable@vger.kernel.org; kernel test robot <oliver.sang@intel.com>
> Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> siw_tcp_sendpages
> 
> On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> >
> >
> > > -----Original Message-----
> > > From: Pedro Falcato <pfalcato@suse.de>
> > > Sent: Wednesday, 23 July 2025 12:41
> > > To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler
> <BMT@zurich.ibm.com>;
> > > Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> > > Cc: Jakub Kicinski <kuba@kernel.org>; David Howells
> <dhowells@redhat.com>;
> > > Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> > > kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> > > <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> > > <oliver.sang@intel.com>
> > [snip]
> > > ---
> > >  drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > index 3a08f57d2211..9576a2b766c4 100644
> > > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s,
> > > struct page **page, int offset,
> > >  		if (!sendpage_ok(page[i]))
> > >  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> > >  		bvec_set_page(&bvec, page[i], bytes, offset);
> > > -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > > +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > >
> > >  try_page_again:
> > >  		lock_sock(sk);
> > > -		rv = tcp_sendmsg_locked(sk, &msg, size);
> > > +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
> > >  		release_sock(sk);
> > >
> >
> > Pedro, many thanks for catching this! I completely
> > missed it during my too-sloppy review of that patch.
> > It's a serious bug which must be fixed asap.
> > BUT, looking closer, I do not see the offset being taken
> > into account when retrying a current segment. So,
> > resend attempts seem to send old data that has already
> > gone out. Shouldn't the try_page_again: label be above
> > bvec_set_page()??
> 
> This was raised off-list by Vlastimil - I think it's harmless to bump (but
> not use) the offset here, because by reusing the iov_iter we progressively
> consume the data (it keeps its own size and offset tracking internally). So
> the only thing we need to track is the size we pass to tcp_sendmsg_locked[1].
> 
Ah okay, I didn't know that. Are we sure? I am currently travelling and have
only limited ability to try things out. I just looked up other
use cases and found one in net/tls/tls_main.c#L197. Here the loop looks
very similar, but it works as I was suggesting (taking offset into account
and re-initializing a new bvec in case of a partial send).
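
Roughly this shape, if I read it right (illustrative only, not the tls code
verbatim, and transplanted into the siw naming):

	/* rebuild the bvec on each pass, advancing offset by what was sent */
retry:
	bvec_set_page(&bvec, page[i], bytes, offset);
	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
	rv = tcp_sendmsg_locked(sk, &msg, bytes);
	if (rv > 0 && rv < bytes) {
		offset += rv;
		bytes -= rv;
		goto retry;
	}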

> If desired (and if my logic is correct!) I can send a v2 deleting that bit.
> 

So yes, if that's all safe, please. We shall not have dead code.

Thanks!
Bernard.
> 
> [1] Assuming tcp_sendmsg_locked guarantees it will never consume something
> out of the iov_iter without reporting it as bytes copied, which, from a
> reading of the code, it seems it won't...
> 
> 
> --
> Pedro


* Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
  2025-07-23 16:49     ` Bernard Metzler
@ 2025-07-25 16:09       ` Pedro Falcato
  2025-07-28 11:34         ` Bernard Metzler
  0 siblings, 1 reply; 6+ messages in thread
From: Pedro Falcato @ 2025-07-25 16:09 UTC (permalink / raw)
  To: Bernard Metzler
  Cc: Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka, Jakub Kicinski,
	David Howells, Tom Talpey, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	stable@vger.kernel.org, kernel test robot

On Wed, Jul 23, 2025 at 04:49:30PM +0000, Bernard Metzler wrote:
> 
> 
> > -----Original Message-----
> > From: Pedro Falcato <pfalcato@suse.de>
> > Sent: Wednesday, 23 July 2025 17:49
> > To: Bernard Metzler <BMT@zurich.ibm.com>
> > Cc: Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> > Vlastimil Babka <vbabka@suse.cz>; Jakub Kicinski <kuba@kernel.org>; David
> > Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>; linux-
> > rdma@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > stable@vger.kernel.org; kernel test robot <oliver.sang@intel.com>
> > Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> > siw_tcp_sendpages
> > 
> > On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Pedro Falcato <pfalcato@suse.de>
> > > > Sent: Wednesday, 23 July 2025 12:41
> > > > To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler
> > <BMT@zurich.ibm.com>;
> > > > Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> > > > Cc: Jakub Kicinski <kuba@kernel.org>; David Howells
> > <dhowells@redhat.com>;
> > > > Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> > > > kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> > > > <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> > > > <oliver.sang@intel.com>
> > > [snip]
> > > > ---
> > > >  drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > index 3a08f57d2211..9576a2b766c4 100644
> > > > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s,
> > > > struct page **page, int offset,
> > > >  		if (!sendpage_ok(page[i]))
> > > >  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> > > >  		bvec_set_page(&bvec, page[i], bytes, offset);
> > > > -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > > > +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > > >
> > > >  try_page_again:
> > > >  		lock_sock(sk);
> > > > -		rv = tcp_sendmsg_locked(sk, &msg, size);
> > > > +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
> > > >  		release_sock(sk);
> > > >
> > >
> > > Pedro, many thanks for catching this! I completely
> > > missed it during my too-sloppy review of that patch.
> > > It's a serious bug which must be fixed asap.
> > > BUT, looking closer, I do not see the offset being taken
> > > into account when retrying a current segment. So,
> > > resend attempts seem to send old data that has already
> > > gone out. Shouldn't the try_page_again: label be above
> > > bvec_set_page()??
> > 
> > This was raised off-list by Vlastimil - I think it's harmless to bump (but
> > not use) the offset here, because by reusing the iov_iter we progressively
> > consume the data (it keeps its own size and offset tracking internally). So
> > the only thing we need to track is the size we pass to tcp_sendmsg_locked[1].
> > 

Hi,

Sorry for the delay.

> Ah okay, I didn't know that. Are we sure? I am currently travelling and have
> only limited ability to try things out. I just looked up other

I'm not 100% sure, and if some more authoritative voice (David, or Jakub for
the net side) could confirm my analysis, it would be great.

> use cases and found one in net/tls/tls_main.c#L197. Here the loop looks
> very similar, but it works as I was suggesting (taking offset into account
> and re-initializing a new bvec in case of a partial send).
> 
> > If desired (and if my logic is correct!) I can send a v2 deleting that bit.
> > 
> 
> So yes, if that's all safe, please. We shall not have dead code.

Understood. I'll send a v2 resetting the bvec and iov_iter if we get no further
feedback in the meantime.

> 
> Thanks!
> Bernard.
> > 
> > [1] Assuming tcp_sendmsg_locked guarantees it will never consume something
> > out of the iov_iter without reporting it as bytes copied, which, from a
> > reading of the code, it seems it won't...
> > 
> > 
> > --
> > Pedro

-- 
Pedro



* RE: [PATCH] RDMA/siw: Fix the sendmsg byte count in siw_tcp_sendpages
  2025-07-25 16:09       ` Pedro Falcato
@ 2025-07-28 11:34         ` Bernard Metzler
  0 siblings, 0 replies; 6+ messages in thread
From: Bernard Metzler @ 2025-07-28 11:34 UTC (permalink / raw)
  To: Pedro Falcato, bernard.metzler@linux.dev
  Cc: Jason Gunthorpe, Leon Romanovsky, Vlastimil Babka, Jakub Kicinski,
	David Howells, Tom Talpey, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	stable@vger.kernel.org, kernel test robot



> -----Original Message-----
> From: Pedro Falcato <pfalcato@suse.de>
> Sent: Friday, 25 July 2025 18:10
> To: Bernard Metzler <BMT@zurich.ibm.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> Vlastimil Babka <vbabka@suse.cz>; Jakub Kicinski <kuba@kernel.org>; David
> Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>; linux-
> rdma@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> stable@vger.kernel.org; kernel test robot <oliver.sang@intel.com>
> Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> siw_tcp_sendpages
> 
> On Wed, Jul 23, 2025 at 04:49:30PM +0000, Bernard Metzler wrote:
> >
> >
> > > -----Original Message-----
> > > From: Pedro Falcato <pfalcato@suse.de>
> > > Sent: Wednesday, 23 July 2025 17:49
> > > To: Bernard Metzler <BMT@zurich.ibm.com>
> > > Cc: Jason Gunthorpe <jgg@ziepe.ca>; Leon Romanovsky <leon@kernel.org>;
> > > Vlastimil Babka <vbabka@suse.cz>; Jakub Kicinski <kuba@kernel.org>;
> David
> > > Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>; linux-
> > > rdma@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> > > stable@vger.kernel.org; kernel test robot <oliver.sang@intel.com>
> > > Subject: [EXTERNAL] Re: [PATCH] RDMA/siw: Fix the sendmsg byte count in
> > > siw_tcp_sendpages
> > >
> > > On Wed, Jul 23, 2025 at 02:52:12PM +0000, Bernard Metzler wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Pedro Falcato <pfalcato@suse.de>
> > > > > Sent: Wednesday, 23 July 2025 12:41
> > > > > To: Jason Gunthorpe <jgg@ziepe.ca>; Bernard Metzler
> > > <BMT@zurich.ibm.com>;
> > > > > Leon Romanovsky <leon@kernel.org>; Vlastimil Babka <vbabka@suse.cz>
> > > > > Cc: Jakub Kicinski <kuba@kernel.org>; David Howells
> > > <dhowells@redhat.com>;
> > > > > Tom Talpey <tom@talpey.com>; linux-rdma@vger.kernel.org; linux-
> > > > > kernel@vger.kernel.org; linux-mm@kvack.org; Pedro Falcato
> > > > > <pfalcato@suse.de>; stable@vger.kernel.org; kernel test robot
> > > > > <oliver.sang@intel.com>
> > > > [snip]
> > > > > ---
> > > > >  drivers/infiniband/sw/siw/siw_qp_tx.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > index 3a08f57d2211..9576a2b766c4 100644
> > > > > --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> > > > > @@ -340,11 +340,11 @@ static int siw_tcp_sendpages(struct socket *s,
> > > > > struct page **page, int offset,
> > > > >  		if (!sendpage_ok(page[i]))
> > > > >  			msg.msg_flags &= ~MSG_SPLICE_PAGES;
> > > > >  		bvec_set_page(&bvec, page[i], bytes, offset);
> > > > > -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> > > > > +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
> > > > >
> > > > >  try_page_again:
> > > > >  		lock_sock(sk);
> > > > > -		rv = tcp_sendmsg_locked(sk, &msg, size);
> > > > > +		rv = tcp_sendmsg_locked(sk, &msg, bytes);
> > > > >  		release_sock(sk);
> > > > >
> > > >
> > > > Pedro, many thanks for catching this! I completely
> > > > missed it during my too-sloppy review of that patch.
> > > > It's a serious bug which must be fixed asap.
> > > > BUT, looking closer, I do not see the offset being taken
> > > > into account when retrying a current segment. So,
> > > > resend attempts seem to send old data that has already
> > > > gone out. Shouldn't the try_page_again: label be above
> > > > bvec_set_page()??
> > >
> > > This was raised off-list by Vlastimil - I think it's harmless to bump (but
> > > not use) the offset here, because by reusing the iov_iter we progressively
> > > consume the data (it keeps its own size and offset tracking internally). So
> > > the only thing we need to track is the size we pass to tcp_sendmsg_locked[1].
> > >
> 
> Hi,
> 
> Sorry for the delay.
> 
> > Ah okay, I didn't know that. Are we sure? I am currently travelling and have
> > only limited ability to try things out. I just looked up other
> 
> I'm not 100% sure, and if some more authoritative voice (David, or Jakub for
> the net side) could confirm my analysis, it would be great.
> 
> > use cases and found one in net/tls/tls_main.c#L197. Here the loop looks
> > very similar, but it works as I was suggesting (taking offset into account
> > and re-initializing a new bvec in case of a partial send).
> >
> > > If desired (and if my logic is correct!) I can send a v2 deleting that
> > > bit.
> > >
> >
> > So yes, if that's all safe, please. We shall not have dead code.
> 
> Understood. I'll send a v2 resetting the bvec and iov_iter if we get no
> further feedback in the meantime.

Let's see if we get input on it. Even if that is all safe, I think
we shall put in a short comment explaining why we do not re-initialize
the bvec.
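
Something like this, maybe (just a suggestion for the wording):

	/* No need to rebuild the bvec on a partial send: msg.msg_iter
	 * tracks how much of it tcp_sendmsg_locked() has already
	 * consumed, so a retry resumes where the previous attempt
	 * stopped. Only the byte count passed to tcp_sendmsg_locked()
	 * must be updated.
	 */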

Please include my new maintainer email address in further iterations,
and stop sending to the old one. My IBM address goes away at the end
of the month.
Thank you!

Bernard.
> 
> >
> > Thanks!
> > Bernard.
> > >
> > > [1] Assuming tcp_sendmsg_locked guarantees it will never consume something
> > > out of the iov_iter without reporting it as bytes copied, which, from a
> > > reading of the code, it seems it won't...
> > >
> > >
> > > --
> > > Pedro
> 
> --
> Pedro

