From: "zhang.guanghui@cestc.cn" <zhang.guanghui@cestc.cn>
To: sagi <sagi@grimberg.me>,  mgurtovoy <mgurtovoy@nvidia.com>,
	 kbusch <kbusch@kernel.org>,  sashal <sashal@kernel.org>,
	 chunguang.xu <chunguang.xu@shopee.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	 linux-nvme <linux-nvme@lists.infradead.org>,
	 linux-block <linux-block@vger.kernel.org>
Subject: Re: Re: nvme-tcp: fix a possible UAF when failing to send request
Date: Thu, 20 Feb 2025 16:20:49 +0800
Message-ID: <202502201620484789268@cestc.cn>
In-Reply-To: aed9dde7-3271-4b97-9632-8380d37505d9@grimberg.me


Hi,

After testing with this patch, no request send failures have occurred, and the issue has not been reproduced so far.
Further testing may take a long time.

Best wishes,

zhang.guanghui@cestc.cn

From: Sagi Grimberg
Sent: 2025-02-17 15:46
To: zhang.guanghui@cestc.cn; mgurtovoy; kbusch; sashal; chunguang.xu
Cc: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request [Note: this message was sent by sagigrim@gmail.com on behalf of the sender]

On 10/02/2025 9:41, zhang.guanghui@cestc.cn wrote:
> Hello
>
> When using the nvme-tcp driver in a storage cluster, the driver may trigger a NULL pointer dereference, which has crashed the host several times.
>
> By analyzing the vmcore, we know the direct cause is that request->mq_hctx was used after free.
>
> CPU1                                    CPU2
>
> nvme_tcp_poll                           nvme_tcp_try_send  -- failed to send request 13
>   nvme_tcp_try_recv                       nvme_tcp_fail_request
>     nvme_tcp_recv_skb                       nvme_tcp_end_request
>       nvme_tcp_recv_pdu                       nvme_complete_rq
>         nvme_tcp_handle_comp                    nvme_retry_req  -- request->mq_hctx has been freed, is NULL
>           nvme_tcp_process_nvme_cqe
>             nvme_complete_rq
>               nvme_end_req
>                 blk_mq_end_request
>
> When nvme_tcp_try_send fails to send request 13, which may be caused by SELinux or other reasons, this is a problem: nvme_tcp_fail_request is then executed.
>
> But nvme_tcp_recv_pdu may already have received the response PDU, and nvme_tcp_process_nvme_cqe would have completed the request, so request->mq_hctx is used after free.
>
> The following patch is intended to solve it.

Zhang, your email client needs fixing - it is impossible to follow your
emails.

> Can you give some suggestions? Thanks!

The problem is the C2HTerm that the host is unable to handle correctly.
And it also appears that nvme_tcp_poll() does not signal correctly to
blk-mq to stop calling poll.

One thing to do is to handle C2HTerm PDU correctly, and, here is a
possible fix to try for the UAF issue:
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c637ff04a052..0e390e98aaf9 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2673,6 +2673,7 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 {
        struct nvme_tcp_queue *queue = hctx->driver_data;
        struct sock *sk = queue->sock->sk;
+       int ret;
 
        if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
                return 0;
@@ -2680,9 +2681,9 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
        set_bit(NVME_TCP_Q_POLLING, &queue->flags);
        if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
                sk_busy_loop(sk, true);
-       nvme_tcp_try_recv(queue);
+       ret = nvme_tcp_try_recv(queue);
        clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
-       return queue->nr_cqe;
+       return ret < 0 ? ret : queue->nr_cqe;
 }
 
 static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
--

Does this help?
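
To make the return-value change in the patch above concrete, here is a minimal user-space sketch in C. It is not driver code: model_queue, model_try_recv, model_poll_old and model_poll_new are hypothetical stand-ins, and the assumption that the receive path reports failure as a negative errno is inferred from the "ret < 0" check in the diff. The sketch only shows that the old form hides a receive error behind the completion count, while the patched form passes the error to the caller.

```c
#include <errno.h>

/* Hypothetical stand-in for the driver's queue; only what the sketch needs. */
struct model_queue {
	int nr_cqe;     /* completions counted so far */
	int recv_error; /* 0, or a negative errno the receive path will report */
};

/*
 * Hypothetical stand-in for the receive path (assumption: it returns a
 * negative errno on failure, otherwise the number of completions handled).
 */
static int model_try_recv(struct model_queue *q, int incoming_cqes)
{
	if (q->recv_error < 0)
		return q->recv_error;
	q->nr_cqe += incoming_cqes;
	return incoming_cqes;
}

/* Pre-patch shape: the receive result is dropped, only the count is returned. */
static int model_poll_old(struct model_queue *q, int incoming_cqes)
{
	model_try_recv(q, incoming_cqes);
	return q->nr_cqe;
}

/* Patched shape: a negative receive result is returned instead of the count. */
static int model_poll_new(struct model_queue *q, int incoming_cqes)
{
	int ret = model_try_recv(q, incoming_cqes);

	return ret < 0 ? ret : q->nr_cqe;
}
```

For a queue whose receive path fails with -ECONNRESET and has no completions, model_poll_old() returns 0, which a caller cannot tell apart from an idle queue, whereas model_poll_new() returns -ECONNRESET. How blk-mq reacts to a negative return is outside what this sketch models.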



 



 



