Linux CIFS filesystem development
From: Stefan Metzmacher <metze@samba.org>
To: David Howells <dhowells@redhat.com>
Cc: "linux-cifs@vger.kernel.org" <linux-cifs@vger.kernel.org>,
	netfs@lists.linux.dev,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Steve French <stfrench@microsoft.com>
Subject: Re: [PATCH] cifs: Collapse smbd_recv_*() into smbd_recv() and just use copy_to_iter()
Date: Wed, 25 Jun 2025 16:18:17 +0200
Message-ID: <15a2d9f7-0945-4bb9-9879-e2a615b8f208@samba.org>
In-Reply-To: <f448a729-ca2e-40a8-be67-3334f47a3916@samba.org>

On 24.06.25 at 14:25, Stefan Metzmacher wrote:
> Hi David,
> 
> this looks very useful! Just a few comments below...
> 
>> Collapse smbd_recv_buf() and smbd_recv_page() into smbd_recv() and just use
>> copy_to_iter() instead of memcpy().
>>
>> Signed-off-by: David Howells <dhowells@redhat.com>
>> cc: Steve French <stfrench@microsoft.com>
>> cc: Tom Talpey <tom@talpey.com>
>> cc: Stefan Metzmacher <metze@samba.org>
>> cc: Paulo Alcantara (Red Hat) <pc@manguebit.com>
>> cc: Matthew Wilcox <willy@infradead.org>
>> cc: linux-cifs@vger.kernel.org
>> cc: netfs@lists.linux.dev
>> cc: linux-fsdevel@vger.kernel.org
>> ---
>>   fs/smb/client/smbdirect.c |  116 +++++++---------------------------------------
>>   1 file changed, 20 insertions(+), 96 deletions(-)
>>
>> diff --git a/fs/smb/client/smbdirect.c b/fs/smb/client/smbdirect.c
>> index 5ae847919da5..dc64c337aae0 100644
>> --- a/fs/smb/client/smbdirect.c
>> +++ b/fs/smb/client/smbdirect.c
>> @@ -1747,35 +1747,39 @@ struct smbd_connection *smbd_get_connection(
>>   }
>>   /*
>> - * Receive data from receive reassembly queue
>> + * Receive data from the transport's receive reassembly queue
>>    * All the incoming data packets are placed in reassembly queue
>> - * buf: the buffer to read data into
>> + * iter: the buffer to read data into
>>    * size: the length of data to read
>>    * return value: actual data read
>> - * Note: this implementation copies the data from reassebmly queue to receive
>> + *
>> + * Note: this implementation copies the data from reassembly queue to receive
>>    * buffers used by upper layer. This is not the optimal code path. A better way
>>    * to do it is to not have upper layer allocate its receive buffers but rather
>>    * borrow the buffer from reassembly queue, and return it after data is
>>    * consumed. But this will require more changes to upper layer code, and also
>>    * need to consider packet boundaries while they still being reassembled.
>>    */
>> -static int smbd_recv_buf(struct smbd_connection *info, char *buf,
>> -        unsigned int size)
>> +int smbd_recv(struct smbd_connection *info, struct msghdr *msg)
>>   {
>>       struct smbdirect_socket *sc = &info->socket;
>>       struct smbd_response *response;
>>       struct smbdirect_data_transfer *data_transfer;
>> +    size_t size = msg->msg_iter.count;
> 
> I think this should be iov_iter_count()?
> 
>>       int to_copy, to_read, data_read, offset;
>>       u32 data_length, remaining_data_length, data_offset;
>>       int rc;
>> +    if (WARN_ON_ONCE(iov_iter_rw(&msg->msg_iter) == WRITE))
>> +        return -EINVAL; /* It's a bug in upper layer to get there */
>> +
>>   again:
>>       /*
>>        * No need to hold the reassembly queue lock all the time as we are
>>        * the only one reading from the front of the queue. The transport
>>        * may add more entries to the back of the queue at the same time
>>        */
>> -    log_read(INFO, "size=%d info->reassembly_data_length=%d\n", size,
>> +    log_read(INFO, "size=%zd info->reassembly_data_length=%d\n", size,
>>           info->reassembly_data_length);
>>       if (info->reassembly_data_length >= size) {
>>           int queue_length;
>> @@ -1811,9 +1815,12 @@ static int smbd_recv_buf(struct smbd_connection *info, char *buf,
>>                * transport layer is added
>>                */
>>               if (response->first_segment && size == 4) {
>> -                unsigned int rfc1002_len =
>> +                unsigned int len =

Please keep the rfc1002_len variable, as it's used in the log_read message below
and it should be in host byte order.

I'd propose a diff like this:

@@ -1846,8 +1850,11 @@ static int smbd_recv_buf(struct smbd_connection *info, char *buf,
                         if (response->first_segment && size == 4) {
                                 unsigned int rfc1002_len =
                                         data_length + remaining_data_length;
-                               *((__be32 *)buf) = cpu_to_be32(rfc1002_len);
+                               __be32 rfc1002_hdr = cpu_to_be32(rfc1002_len);
                                 data_read = 4;
+                               if (copy_to_iter(&rfc1002_hdr, sizeof(rfc1002_hdr),
+                                                &msg->msg_iter) != data_read)
+                                       return -EFAULT;
                                 response->first_segment = false;
                                 log_read(INFO, "returning rfc1002 length %d\n",
                                         rfc1002_len);
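
To illustrate the byte order point outside the kernel, here is a minimal
userspace sketch. It assumes htonl() as a stand-in for cpu_to_be32(), and
build_rfc1002_hdr() is a hypothetical helper that does not exist in
smbdirect.c. The host-order value is what a log message should print, while
the separate big-endian copy is what gets handed to the caller's buffer:

```c
#include <arpa/inet.h>  /* htonl() as a userspace stand-in for cpu_to_be32() */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: writes the 4-byte
 * big-endian length header into out[] and returns the same length
 * in host byte order, which is the value suitable for logging.
 */
static uint32_t build_rfc1002_hdr(uint32_t data_length,
                                  uint32_t remaining_data_length,
                                  unsigned char out[4])
{
    uint32_t rfc1002_len = data_length + remaining_data_length; /* host order */
    uint32_t rfc1002_hdr = htonl(rfc1002_len);                  /* wire order */

    assert(out != NULL);
    memcpy(out, &rfc1002_hdr, sizeof(rfc1002_hdr));
    return rfc1002_len;
}
```

Printing the big-endian variable with %d on a little-endian machine would
show a byte-swapped number, which is why the two values are kept apart.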


>>                       data_length + remaining_data_length;
>> -                *((__be32 *)buf) = cpu_to_be32(rfc1002_len);
>> +                __be32 rfc1002_len = cpu_to_be32(len);
>> +                if (copy_to_iter(&rfc1002_len, sizeof(rfc1002_len),
>> +                         &msg->msg_iter) != sizeof(rfc1002_len))
>> +                    return -EFAULT;
>>                   data_read = 4;
>>                   response->first_segment = false;
>>                   log_read(INFO, "returning rfc1002 length %d\n",
>> @@ -1822,10 +1829,9 @@ static int smbd_recv_buf(struct smbd_connection *info, char *buf,
>>               }
>>               to_copy = min_t(int, data_length - offset, to_read);
>> -            memcpy(
>> -                buf + data_read,
>> -                (char *)data_transfer + data_offset + offset,
>> -                to_copy);
>> +            if (copy_to_iter((char *)data_transfer + data_offset + offset,
>> +                     to_copy, &msg->msg_iter) != to_copy)
>> +                return -EFAULT;
>>               /* move on to the next buffer? */
>>               if (to_copy == data_length - offset) {
>> @@ -1870,6 +1876,8 @@ static int smbd_recv_buf(struct smbd_connection *info, char *buf,
>>                data_read, info->reassembly_data_length,
>>                info->first_entry_offset);
>>   read_rfc1002_done:
>> +        /* SMBDirect will read it all or nothing */
>> +        msg->msg_iter.count = 0;
> 
> And this should be iov_iter_truncate(0);
> 
> Though I'm wondering why we had this at all.
> 
> It seems none of the callers of cifs_read_iter_from_socket()
> care, and the code path via sock_recvmsg() doesn't
> truncate; it just calls copy_to_iter() via this chain:
> ->inet_recvmsg->tcp_recvmsg->skb_copy_datagram_msg->skb_copy_datagram_iter
> ->simple_copy_to_iter->copy_to_iter()
> 
> I think the old code should have called
> iov_iter_advance(rc) instead of msg->msg_iter.count = 0.
> 
> But the new code doesn't need it as copy_to_iter()
> calls iterate_and_advance().
> 
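
As a rough userspace model of why the explicit count reset becomes
unnecessary: a copy helper that advances the iterator itself leaves the
remaining count correct after every call. struct toy_iter and
toy_copy_to_iter() below are hypothetical stand-ins for struct iov_iter and
copy_to_iter(), not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct iov_iter: one flat buffer plus bytes remaining. */
struct toy_iter {
    char *pos;
    size_t count;
};

/*
 * Copies at most len bytes into the iterator and advances it by the
 * number of bytes copied. Returns the number of bytes copied, which
 * is smaller than len when the iterator runs out of space.
 */
static size_t toy_copy_to_iter(const void *src, size_t len, struct toy_iter *it)
{
    size_t n;

    assert(it != NULL);
    n = len < it->count ? len : it->count;
    memcpy(it->pos, src, n);
    it->pos += n;    /* the helper moves the cursor itself ... */
    it->count -= n;  /* ... so the caller never has to touch count */
    return n;
}
```

A caller that compares the return value with the requested length, as the
patch does, detects a short copy without adjusting the iterator by hand.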
>>           return data_read;
>>       }
>> @@ -1890,90 +1898,6 @@ static int smbd_recv_buf(struct smbd_connection *info, char *buf,
>>       goto again;
>>   }
>> -/*
>> - * Receive a page from receive reassembly queue
>> - * page: the page to read data into
>> - * to_read: the length of data to read
>> - * return value: actual data read
>> - */
>> -static int smbd_recv_page(struct smbd_connection *info,
>> -        struct page *page, unsigned int page_offset,
>> -        unsigned int to_read)
>> -{
>> -    struct smbdirect_socket *sc = &info->socket;
>> -    int ret;
>> -    char *to_address;
>> -    void *page_address;
>> -
>> -    /* make sure we have the page ready for read */
>> -    ret = wait_event_interruptible(
>> -        info->wait_reassembly_queue,
>> -        info->reassembly_data_length >= to_read ||
>> -            sc->status != SMBDIRECT_SOCKET_CONNECTED);
>> -    if (ret)
>> -        return ret;
>> -
>> -    /* now we can read from reassembly queue and not sleep */
>> -    page_address = kmap_atomic(page);
>> -    to_address = (char *) page_address + page_offset;
>> -
>> -    log_read(INFO, "reading from page=%p address=%p to_read=%d\n",
>> -        page, to_address, to_read);
>> -
>> -    ret = smbd_recv_buf(info, to_address, to_read);
>> -    kunmap_atomic(page_address);
>> -
>> -    return ret;
>> -}
>> -
>> -/*
>> - * Receive data from transport
>> - * msg: a msghdr point to the buffer, can be ITER_KVEC or ITER_BVEC
>> - * return: total bytes read, or 0. SMB Direct will not do partial read.
>> - */
>> -int smbd_recv(struct smbd_connection *info, struct msghdr *msg)
>> -{
>> -    char *buf;
>> -    struct page *page;
>> -    unsigned int to_read, page_offset;
>> -    int rc;
>> -
>> -    if (iov_iter_rw(&msg->msg_iter) == WRITE) {
>> -        /* It's a bug in upper layer to get there */
>> -        cifs_dbg(VFS, "Invalid msg iter dir %u\n",
>> -             iov_iter_rw(&msg->msg_iter));
>> -        rc = -EINVAL;
>> -        goto out;
>> -    }
>> -
>> -    switch (iov_iter_type(&msg->msg_iter)) {
>> -    case ITER_KVEC:
>> -        buf = msg->msg_iter.kvec->iov_base;
>> -        to_read = msg->msg_iter.kvec->iov_len;
>> -        rc = smbd_recv_buf(info, buf, to_read);
>> -        break;
>> -
>> -    case ITER_BVEC:
>> -        page = msg->msg_iter.bvec->bv_page;
>> -        page_offset = msg->msg_iter.bvec->bv_offset;
>> -        to_read = msg->msg_iter.bvec->bv_len;
>> -        rc = smbd_recv_page(info, page, page_offset, to_read);
>> -        break;
>> -
>> -    default:
>> -        /* It's a bug in upper layer to get there */
>> -        cifs_dbg(VFS, "Invalid msg type %d\n",
>> -             iov_iter_type(&msg->msg_iter));
>> -        rc = -EINVAL;
>> -    }
> 
> I guess this is actually a real fix as I just saw
> CIFS: VFS: Invalid msg type 4
> in logs while running the cifs/001 test.
> And 4 is ITER_FOLIOQ.
> 
> So there might be something broken when ITER_FOLIOQ was
> introduced, but I wasn't able to find a specific commit.
> Maybe it was also already broken when using
> smb3 encryption over smbdirect, when ITER_XARRAY was still used.
> 
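
The failure mode can be sketched in userspace like this. The enum and
toy_recv_dispatch() are hypothetical and only mirror the shape of the
removed switch statement; the values are not the kernel's enum iter_type:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical iterator kinds, for illustration only. */
enum toy_iter_type {
    TOY_KVEC,
    TOY_BVEC,
    TOY_FOLIOQ, /* a kind added after the dispatcher was written */
};

/*
 * Mirrors the shape of the removed code: only the kinds listed in the
 * switch are handled, so any newer kind is rejected with -EINVAL even
 * though the data could be copied through a type-agnostic helper.
 */
static int toy_recv_dispatch(enum toy_iter_type type)
{
    assert(type >= TOY_KVEC && type <= TOY_FOLIOQ);

    switch (type) {
    case TOY_KVEC:
    case TOY_BVEC:
        return 0;       /* handled by a type-specific path */
    default:
        return -EINVAL; /* "Invalid msg type" in the old code */
    }
}
```

Going through a generic copy helper avoids keeping such a list of iterator
kinds in the transport at all.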
> metze
> 

