* nvme-tcp: fix a possible UAF when failing to send request
@ 2025-02-10 7:41 zhang.guanghui
2025-02-10 10:01 ` Maurizio Lombardi
` (2 more replies)
0 siblings, 3 replies; 28+ messages in thread
From: zhang.guanghui @ 2025-02-10 7:41 UTC (permalink / raw)
To: sagi, mgurtovoy, kbusch, sashal, chunguang.xu, zhang.guanghui
Cc: linux-kernel, linux-nvme, linux-block
Hello
When using the nvme-tcp driver in a storage cluster, the driver may trigger a null pointer dereference, and the host has crashed several times because of it.
By analyzing the vmcore, we know the direct cause is that request->mq_hctx was used after being freed.
CPU1                               CPU2

nvme_tcp_poll                      nvme_tcp_try_send        -- failed to send request 13
nvme_tcp_try_recv                  nvme_tcp_fail_request
nvme_tcp_recv_skb                  nvme_tcp_end_request
nvme_tcp_recv_pdu                  nvme_complete_rq
nvme_tcp_handle_comp               nvme_retry_req           -- request->mq_hctx has been freed (NULL)
nvme_tcp_process_nvme_cqe
nvme_complete_rq
nvme_end_req
blk_mq_end_request
When nvme_tcp_try_send() failed to send request 13 (the failure may be caused by selinux or other reasons), nvme_tcp_fail_request() was executed.
But nvme_tcp_recv_pdu() may already have received the corresponding response PDU, and nvme_tcp_process_nvme_cqe() would then have completed the request, so request->mq_hctx was used after free.
The following patch is meant to solve it.
Can you give some suggestions? Thanks!
diff --git a/linux/drivers/nvme/host/core.c b/linux/drivers/nvme/host/core.c
index a65b1dce8..417466674 100644
--- a/linux/drivers/nvme/host/core.c
+++ b/linux/drivers/nvme/host/core.c
@@ -288,6 +288,9 @@ static void nvme_retry_req(struct request *req)
 	unsigned long delay = 0;
 	u16 crd;
 
+	if (!req->mq_hctx && req->state == MQ_RQ_IDLE)
+		return;
+
 	/* The mask and shift result must be <= 3 */
 	crd = (nvme_req(req)->status & NVME_SC_CRD) >> 11;
The details are as follows
[35665.692310] nvme nvme2: failed to send request -13
[35665.692683] nvme nvme2: queue 1 failed to send request 00000000b42f4e2b state 2 pdu 00000000d7fb8da3 type 4 rq_state 1 nrq_status 0
[35665.693323] nvme nvme2: failed to send rq 00000000f86a68c3 state 2 nrq_status 370
[35665.702265] nvme nvme2: unsupported pdu type (3)
[35665.702272] BUG: kernel NULL pointer dereference, address: 0000000000000000
[35665.702542] nvme nvme2: queue 1 receive failed: -22
[35665.703209] #PF: supervisor write access in kernel mode
[35665.703213] #PF: error_code(0x0002) - not-present page
[35665.703214] PGD 8000003801cce067 P4D 8000003801cce067 PUD 37e6f79067 PMD 0
[35665.703220] Oops: 0002 [#1] SMP PTI
[35665.703658] nvme nvme2: starting error recovery
[35665.704442] CPU: 20 PID: 815 Comm: kworker/20:1H Kdump: loaded Not tainted 5.15.131-17.cl9.x86_64 #1
[35665.705168] nvme nvme2: queue 1 receive again after receive failed
[35665.705809] Hardware name: Inspur aaabbb/YZMB-00882-104, BIOS 4.1.26 09/22/2022
[35665.705812] Workqueue: kblockd blk_mq_requeue_work
[35665.709172] RIP: 0010:_raw_spin_lock+0xc/0x30
[35665.709606] Code: 05 c3 cc cc cc cc 89 c6 e8 31 05 68 ff 66 90 c3 cc cc cc cc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 31 c0 ba 01 00 00 00 <f0> 0f b1 17 75 05 c3 cc cc cc cc 89 c6 e8 02 05 68 ff 66 90 c3 cc
[35665.710470] RSP: 0000:ffffa67bcd797e08 EFLAGS: 00010246
[35665.710925] RAX: 0000000000000000 RBX: ffff92f6bbcc9840 RCX: ffff92f6bbcc9888
[35665.711393] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
[35665.711849] RBP: 0000000000000000 R08: ffffa67bcd797e48 R09: ffff932346d576f4
[35665.712275] R10: 0000000000000008 R11: 0000000000000008 R12: 0000000000000000
[35665.712725] R13: ffff92f6bbcc9888 R14: 0000000000000008 R15: 0000000000000000
[35665.713158] FS: 0000000000000000(0000) GS:ffff93527d400000(0000) knlGS:0000000000000000
[35665.713603] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[35665.714063] CR2: 0000000000000000 CR3: 000000371aa02006 CR4: 00000000007706e0
[35665.714534] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[35665.714961] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[35665.715359] PKRU: 55555554
[35665.715788] Call Trace:
[35665.716201] <TASK>
[35665.716613] ? show_trace_log_lvl+0x1c1/0x2d9
[35665.717049] ? show_trace_log_lvl+0x1c1/0x2d9
[35665.717457] ? blk_mq_request_bypass_insert+0x2c/0xb0
[35665.717950] ? __die_body.cold+0x8/0xd
[35665.718361] ? page_fault_oops+0xac/0x140
[35665.718749] ? blk_mq_start_request+0x30/0xf0
[35665.719144] ? nvme_tcp_queue_rq+0xc7/0x170 [nvme_tcp]
[35665.719547] ? exc_page_fault+0x62/0x130
[35665.719938] ? asm_exc_page_fault+0x22/0x30
[35665.720333] ? _raw_spin_lock+0xc/0x30
[35665.720723] blk_mq_request_bypass_insert+0x2c/0xb0
[35665.721101] blk_mq_requeue_work+0xa5/0x180
[35665.721451] process_one_work+0x1e8/0x390
[35665.721809] worker_thread+0x53/0x3d0
[35665.722159] ? process_one_work+0x390/0x390
[35665.722501] kthread+0x124/0x150
[35665.722849] ? set_kthread_struct+0x50/0x50
[35665.723182] ret_from_fork+0x1f/0x30
[35665.723508] </TASK>
crash> struct nvme_tcp_request ffff92f6bbcc9950
struct nvme_tcp_request {
req = {
cmd = 0xffff92f5b83f6748,
result = {
u16 = 0,
u32 = 0,
u64 = 0
},
genctr = 169 '\251',
retries = 1 '\001',
flags = 0 '\000',
status = 6,
ctrl = 0xffff92f5e5df7348
},
pdu = 0xffff92f5b83f6740,
queue = 0xffff92f407cc9128,
data_len = 4096,
pdu_len = 4096,
pdu_sent = 0,
h2cdata_left = 0,
h2cdata_offset = 0,
ttag = 62,
status = 12,
entry = {
next = 0xdead000000000100,
prev = 0xdead000000000122
},
lentry = {
next = 0x0
},
ddgst = 0,
curr_bio = 0xffff9324d639e240,
iter = {
iter_type = 2 '\002',
nofault = false,
data_source = true,
iov_offset = 0,
count = 4096,
{
iov = 0xffff92f6bbcc98a8,
kvec = 0xffff92f6bbcc98a8,
bvec = 0xffff92f6bbcc98a8,
xarray = 0xffff92f6bbcc98a8,
pipe = 0xffff92f6bbcc98a8
},
{
nr_segs = 1,
{
head = 1,
start_head = 0
},
xarray_start = 1
}
},
offset = 0,
data_sent = 0,
state = NVME_TCP_SEND_DATA
}
crash> nvme_tcp_hdr.type 0xffff92f5b83f6740
type = 4 '\004',
crash>
crash> struct request ffff92f6bbcc9840
struct request {
q = 0xffff92f59d55c240,
mq_ctx = 0xffffc67bb9a1f040,
mq_hctx = 0x0,
cmd_flags = 33556483,
rq_flags = 139456,
tag = 87,
internal_tag = -1,
__data_len = 0,
__sector = 66846720,
bio = 0x0,
biotail = 0x0,
queuelist = {
next = 0xffff92f6bbcc9888,
prev = 0xffff92f6bbcc9888
},
{
hash = {
next = 0x0,
pprev = 0x0
},
ipi_list = {
next = 0x0
}
},
{
rb_node = {
__rb_parent_color = 18446685131795018112,
rb_right = 0x1000,
rb_left = 0x0
},
special_vec = {
bv_page = 0xffffca64841f3180,
bv_len = 4096,
bv_offset = 0
},
completion_data = 0xffffca64841f3180,
error_count = -2078330496
},
{
elv = {
icq = 0x0,
priv = {0xffff92f6bbcc98c8, 0xffff92f6bbcc98c8}
},
flush = {
seq = 0,
list = {
next = 0xffff92f6bbcc98c8,
prev = 0xffff92f6bbcc98c8
},
saved_end_io = 0x0
}
},
rq_disk = 0xffff92f6bdbff600,
part = 0xffff92f59f557800,
start_time_ns = 35665692557229,
io_start_time_ns = 35665692566268,
wbt_flags = 0,
stats_sectors = 0,
nr_phys_segments = 1,
nr_integrity_segments = 0,
write_hint = 0,
ioprio = 0,
state = MQ_RQ_IDLE,
ref = {
refs = {
counter = 0
}
},
timeout = 180000,
deadline = 4330512774,
{
csd = {
node = {
llist = {
next = 0x0
},
{
u_flags = 0,
a_flags = {
counter = 0
}
},
src = 0,
dst = 0
},
func = 0xffffffff8eed62d0 <__blk_mq_complete_request_remote>,
info = 0xffff92f6bbcc9840
},
fifo_time = 0
},
end_io = 0x0,
end_io_data = 0x0
}
Best regards
zhang.guanghui@cestc.cn
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-10 7:41 nvme-tcp: fix a possible UAF when failing to send request zhang.guanghui
@ 2025-02-10 10:01 ` Maurizio Lombardi
2025-02-10 10:24 ` Max Gurtovoy
2025-02-12 15:33 ` Maurizio Lombardi
2025-02-17 7:46 ` Sagi Grimberg
2 siblings, 1 reply; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-10 10:01 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, sagi, mgurtovoy, kbusch, sashal,
chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On Mon Feb 10, 2025 at 8:41 AM CET, zhang.guanghui@cestc.cn wrote:
> Hello
>
>
I guess you have to fix your mail client.
>
> When using the nvme-tcp driver in a storage cluster, the driver may trigger a null pointer causing the host to crash several times.
> By analyzing the vmcore, we know the direct cause is that the request->mq_hctx was used after free.
>
>
> CPU1 CPU2
>
> nvme_tcp_poll nvme_tcp_try_send --failed to send reqrest 13
This simply looks like a race condition between nvme_tcp_poll() and nvme_tcp_try_send()
Personally, I would try to fix it inside the nvme-tcp driver without
touching the core functions.
Maybe nvme_tcp_poll should just ensure that io_work completes before
calling nvme_tcp_try_recv(), the POLLING flag should then prevent io_work
from getting rescheduled by the nvme_tcp_data_ready() callback.
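A minimal sketch of that idea (untested, and assuming flush_work() on queue->io_work is safe to call from the poll path) would be something like:

static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
{
	struct nvme_tcp_queue *queue = hctx->driver_data;
	struct sock *sk = queue->sock->sk;

	if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
		return 0;

	/* data_ready() skips queueing io_work while POLLING is set */
	set_bit(NVME_TCP_Q_POLLING, &queue->flags);
	/* wait for any in-flight io_work before polling the socket ourselves */
	flush_work(&queue->io_work);
	if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
		sk_busy_loop(sk, true);
	nvme_tcp_try_recv(queue);
	clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
	return queue->nr_cqe;
}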
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-10 10:01 ` Maurizio Lombardi
@ 2025-02-10 10:24 ` Max Gurtovoy
2025-02-10 11:16 ` zhang.guanghui
2025-02-10 16:40 ` Maurizio Lombardi
0 siblings, 2 replies; 28+ messages in thread
From: Max Gurtovoy @ 2025-02-10 10:24 UTC (permalink / raw)
To: Maurizio Lombardi, zhang.guanghui@cestc.cn, sagi, kbusch, sashal,
chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On 10/02/2025 12:01, Maurizio Lombardi wrote:
> On Mon Feb 10, 2025 at 8:41 AM CET, zhang.guanghui@cestc.cn wrote:
>> Hello
>>
>>
> I guess you have to fix your mail client.
>
>> When using the nvme-tcp driver in a storage cluster, the driver may trigger a null pointer causing the host to crash several times.
>> By analyzing the vmcore, we know the direct cause is that the request->mq_hctx was used after free.
>>
>>
>> CPU1 CPU2
>>
>> nvme_tcp_poll nvme_tcp_try_send --failed to send reqrest 13
> This simply looks like a race condition between nvme_tcp_poll() and nvme_tcp_try_send()
> Personally, I would try to fix it inside the nvme-tcp driver without
> touching the core functions.
>
> Maybe nvme_tcp_poll should just ensure that io_work completes before
> calling nvme_tcp_try_recv(), the POLLING flag should then prevent io_work
> from getting rescheduled by the nvme_tcp_data_ready() callback.
>
>
> Maurizio
It seems to me that the HOST_PATH_ERROR handling can be improved in
nvme-tcp.
In nvme-rdma we use nvme_host_path_error(rq) and nvme_cleanup_cmd(rq) in
case we fail to submit a command.
Can you try replacing the nvme_tcp_end_request(blk_mq_rq_from_pdu(req),
NVME_SC_HOST_PATH_ERROR) call with logic similar to what we use in
nvme-rdma for host path error handling?
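Something along these lines could work (untested sketch; nvme_host_path_error() and nvme_cleanup_cmd() are the core helpers nvme-rdma relies on):

static void nvme_tcp_fail_request(struct nvme_tcp_request *req)
{
	struct request *rq = blk_mq_rq_from_pdu(req);

	/* mirror the nvme-rdma error path: clean up the command and
	 * complete it with a host path error instead of
	 * nvme_tcp_end_request(rq, NVME_SC_HOST_PATH_ERROR) */
	nvme_cleanup_cmd(rq);
	nvme_host_path_error(rq);
}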
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-10 10:24 ` Max Gurtovoy
@ 2025-02-10 11:16 ` zhang.guanghui
2025-02-10 11:34 ` Max Gurtovoy
2025-02-10 16:40 ` Maurizio Lombardi
1 sibling, 1 reply; 28+ messages in thread
From: zhang.guanghui @ 2025-02-10 11:16 UTC (permalink / raw)
To: mgurtovoy, Maurizio Lombardi, sagi, kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
Hi,
Thank you for your reply.
In nvme-rdma, nvme_host_path_error(rq) is only used when the submission error is -EIO; should we ignore that condition here?
In nvme-tcp the failure condition is different, and even if we use nvme_host_path_error(),
it should still be possible to go through nvme_complete_rq() -> nvme_retry_req() while request->mq_hctx has already been freed (it is NULL).
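The nvme-rdma error path I am referring to looks roughly like this (paraphrased from the tail of nvme_rdma_queue_rq() in drivers/nvme/host/rdma.c; the exact form depends on the kernel version):

err:
	/* only -EIO is turned into a host path error; other errors are
	 * returned to blk-mq as BLK_STS_RESOURCE / BLK_STS_IOERR instead
	 * of completing the request */
	if (err == -EIO)
		ret = nvme_host_path_error(rq);
	else if (err == -ENOMEM || err == -EAGAIN)
		ret = BLK_STS_RESOURCE;
	else
		ret = BLK_STS_IOERR;
	nvme_cleanup_cmd(rq);
	return ret;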
zhang.guanghui@cestc.cn
From: Max Gurtovoy
Date: 2025-02-10 18:24
To: Maurizio Lombardi; zhang.guanghui@cestc.cn; sagi; kbusch; sashal; chunguang.xu
CC: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request
On 10/02/2025 12:01, Maurizio Lombardi wrote:
> On Mon Feb 10, 2025 at 8:41 AM CET, zhang.guanghui@cestc.cn wrote:
>> Hello
>>
>>
> I guess you have to fix your mail client.
>
>> When using the nvme-tcp driver in a storage cluster, the driver may trigger a null pointer causing the host to crash several times.
>> By analyzing the vmcore, we know the direct cause is that the request->mq_hctx was used after free.
>>
>>
>> CPU1 CPU2
>>
>> nvme_tcp_poll nvme_tcp_try_send --failed to send reqrest 13
> This simply looks like a race condition between nvme_tcp_poll() and nvme_tcp_try_send()
> Personally, I would try to fix it inside the nvme-tcp driver without
> touching the core functions.
>
> Maybe nvme_tcp_poll should just ensure that io_work completes before
> calling nvme_tcp_try_recv(), the POLLING flag should then prevent io_work
> from getting rescheduled by the nvme_tcp_data_ready() callback.
>
>
> Maurizio
It seems to me that the HOST_PATH_ERROR handling can be improved in
nvme-tcp.
In nvme-rdma we use nvme_host_path_error(rq) and nvme_cleanup_cmd(rq) in
case we fail to submit a command..
can you try to replacing nvme_tcp_end_request(blk_mq_rq_from_pdu(req),
NVME_SC_HOST_PATH_ERROR); call with the similar logic we use in
nvme-rdma for host path error handling ?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-10 11:16 ` zhang.guanghui
@ 2025-02-10 11:34 ` Max Gurtovoy
0 siblings, 0 replies; 28+ messages in thread
From: Max Gurtovoy @ 2025-02-10 11:34 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, Maurizio Lombardi, sagi, kbusch, sashal,
chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
hi
On 10/02/2025 13:16, zhang.guanghui@cestc.cn wrote:
> Hi,
>
>
>
> Thank you for your reply.
>
>
> In nvme-rdma we use nvme_host_path_error(rq) , the prerequisites are -EIO, ignore this condition?
> in nvme-tcp this judgment condition is different, use the function nvme_host_path_error ,
> it should be possible to go to nvme_complete_rq -> nvme_retry_req -- request->mq_hctx have been freed, is NULL.
Did you try the proposal, or are you assuming it will be NULL?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-10 10:24 ` Max Gurtovoy
2025-02-10 11:16 ` zhang.guanghui
@ 2025-02-10 16:40 ` Maurizio Lombardi
[not found] ` <CAAO4dAWdsMjYMp9jdWXd_48aG0mTtVpRONqjJJ1scnc773tHzg@mail.gmail.com>
1 sibling, 1 reply; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-10 16:40 UTC (permalink / raw)
To: Max Gurtovoy, zhang.guanghui@cestc.cn, sagi, kbusch, sashal,
chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On Mon Feb 10, 2025 at 11:24 AM CET, Max Gurtovoy wrote:
>
> It seems to me that the HOST_PATH_ERROR handling can be improved in
> nvme-tcp.
>
> In nvme-rdma we use nvme_host_path_error(rq) and nvme_cleanup_cmd(rq) in
> case we fail to submit a command..
>
> can you try to replacing nvme_tcp_end_request(blk_mq_rq_from_pdu(req),
> NVME_SC_HOST_PATH_ERROR); call with the similar logic we use in
> nvme-rdma for host path error handling ?
Yes, I could try to prepare a patch.
In any case, I think the main issue here is that nvme_tcp_poll()
should be prevented from racing against io_work... and I also think
there is a possible race condition if nvme_tcp_poll() races against
the controller resetting code.
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request
[not found] ` <CAAO4dAWdsMjYMp9jdWXd_48aG0mTtVpRONqjJJ1scnc773tHzg@mail.gmail.com>
@ 2025-02-11 8:04 ` zhang.guanghui
2025-02-12 8:11 ` Maurizio Lombardi
0 siblings, 1 reply; 28+ messages in thread
From: zhang.guanghui @ 2025-02-11 8:04 UTC (permalink / raw)
To: chunguang.xu, Maurizio Lombardi
Cc: mgurtovoy, sagi, kbusch, sashal, linux-kernel, linux-nvme,
linux-block
Hi
This is a race issue, and I can't reproduce it stably yet. I have not tested the latest kernel, but I have synced some nvme-tcp patches from latest upstream
and the issue still exists. I reviewed the nvme-tcp code and found that this issue may exist in the nvme_tcp_poll() processing.
zhang.guanghui@cestc.cn
From: Xu Chunguang
Date: 2025-02-11 15:20
To: Maurizio Lombardi
CC: Max Gurtovoy; zhang.guanghui; sagi; kbusch; sashal; linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request
Hi:
Have you tested the latest kernel? Can it reproduce the same issue?
Maurizio Lombardi <mlombard@bsdbackstore.eu> wrote on Tuesday, February 11, 2025 at 00:40:
On Mon Feb 10, 2025 at 11:24 AM CET, Max Gurtovoy wrote:
>
> It seems to me that the HOST_PATH_ERROR handling can be improved in
> nvme-tcp.
>
> In nvme-rdma we use nvme_host_path_error(rq) and nvme_cleanup_cmd(rq) in
> case we fail to submit a command..
>
> can you try to replacing nvme_tcp_end_request(blk_mq_rq_from_pdu(req),
> NVME_SC_HOST_PATH_ERROR); call with the similar logic we use in
> nvme-rdma for host path error handling ?
Yes, I could try to prepare a patch.
In any case, I think the main issue here is that nvme_tcp_poll()
should be prevented from racing against io_work... and I also think
there is a possible race condition if nvme_tcp_poll() races against
the controller resetting code.
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-11 8:04 ` zhang.guanghui
@ 2025-02-12 8:11 ` Maurizio Lombardi
2025-02-12 8:23 ` Maurizio Lombardi
2025-02-12 8:52 ` Maurizio Lombardi
0 siblings, 2 replies; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-12 8:11 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, chunguang.xu
Cc: mgurtovoy, sagi, kbusch, sashal, linux-kernel, linux-nvme,
linux-block
On Tue Feb 11, 2025 at 9:04 AM CET, zhang.guanghui@cestc.cn wrote:
> Hi
>
> This is a race issue, I can't reproduce it stably yet. I have not tested the latest kernel. but in fact, I've synced some nvme-tcp patches from lastest upstream,
Hello, could you try this patch?
queue_lock should protect against concurrent "error recovery",
while send_mutex should serialize try_recv() and try_send(), emulating
the way io_work works.
Concurrent calls to try_recv() should already be protected by
sock_lock.
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 841238f38fdd..f464de04ff4d 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2653,16 +2653,24 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
{
struct nvme_tcp_queue *queue = hctx->driver_data;
struct sock *sk = queue->sock->sk;
+ int r = 0;
+ mutex_lock(&queue->queue_lock);
if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
- return 0;
+ goto out;
set_bit(NVME_TCP_Q_POLLING, &queue->flags);
if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
sk_busy_loop(sk, true);
+
+ mutex_lock(&queue->send_mutex);
nvme_tcp_try_recv(queue);
+ r = queue->nr_cqe;
+ mutex_unlock(&queue->send_mutex);
clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
- return queue->nr_cqe;
+out:
+ mutex_unlock(&queue->queue_lock);
+ return r;
}
static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
Thanks,
Maurizio
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-12 8:11 ` Maurizio Lombardi
@ 2025-02-12 8:23 ` Maurizio Lombardi
2025-02-12 8:52 ` Maurizio Lombardi
1 sibling, 0 replies; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-12 8:23 UTC (permalink / raw)
To: Maurizio Lombardi, zhang.guanghui@cestc.cn, chunguang.xu
Cc: mgurtovoy, sagi, kbusch, sashal, linux-kernel, linux-nvme,
linux-block
On Wed Feb 12, 2025 at 9:11 AM CET, Maurizio Lombardi wrote:
> Hello, could you try this patch?
>
> Concurrent calls to try_recv() should already be protected by
> sock_lock.
>
> + mutex_lock(&queue->send_mutex);
> nvme_tcp_try_recv(queue);
> + r = queue->nr_cqe;
> + mutex_unlock(&queue->send_mutex);
Well, reading nr_cqe like this is still racy, but should be a minor
issue and not hard to fix.
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-12 8:11 ` Maurizio Lombardi
2025-02-12 8:23 ` Maurizio Lombardi
@ 2025-02-12 8:52 ` Maurizio Lombardi
2025-02-12 9:47 ` zhang.guanghui
1 sibling, 1 reply; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-12 8:52 UTC (permalink / raw)
To: Maurizio Lombardi, zhang.guanghui@cestc.cn, chunguang.xu
Cc: mgurtovoy, sagi, kbusch, sashal, linux-kernel, linux-nvme,
linux-block
On Wed Feb 12, 2025 at 9:11 AM CET, Maurizio Lombardi wrote:
> On Tue Feb 11, 2025 at 9:04 AM CET, zhang.guanghui@cestc.cn wrote:
>> Hi
>>
>> This is a race issue, I can't reproduce it stably yet. I have not tested the latest kernel. but in fact, I've synced some nvme-tcp patches from lastest upstream,
>
> Hello, could you try this patch?
>
> queue_lock should protect against concurrent "error recovery",
> + mutex_lock(&queue->queue_lock);
Unfortunately I've just realized that queue_lock won't save us
from the race against the controller reset, it's still possible
we lock a destroyed mutex. So just try this
simplified patch, I will try to figure out something else:
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 841238f38fdd..b714e1691c30 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2660,7 +2660,10 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 	set_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
 		sk_busy_loop(sk, true);
+
+	mutex_lock(&queue->send_mutex);
 	nvme_tcp_try_recv(queue);
+	mutex_unlock(&queue->send_mutex);
 	clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	return queue->nr_cqe;
 }
Maurizio
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-12 8:52 ` Maurizio Lombardi
@ 2025-02-12 9:47 ` zhang.guanghui
2025-02-12 10:28 ` Maurizio Lombardi
0 siblings, 1 reply; 28+ messages in thread
From: zhang.guanghui @ 2025-02-12 9:47 UTC (permalink / raw)
To: Maurizio Lombardi, chunguang.xu
Cc: mgurtovoy, sagi, kbusch, sashal, linux-kernel, linux-nvme,
linux-block
Hi, Thanks.
I will test this patch, but I am worried it may affect performance.
Should we also consider null pointer protection?
zhang.guanghui@cestc.cn
From: Maurizio Lombardi
Date: 2025-02-12 16:52
To: Maurizio Lombardi; zhang.guanghui@cestc.cn; chunguang.xu
CC: mgurtovoy; sagi; kbusch; sashal; linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request
On Wed Feb 12, 2025 at 9:11 AM CET, Maurizio Lombardi wrote:
> On Tue Feb 11, 2025 at 9:04 AM CET, zhang.guanghui@cestc.cn wrote:
>> Hi
>>
>> This is a race issue, I can't reproduce it stably yet. I have not tested the latest kernel. but in fact, I've synced some nvme-tcp patches from lastest upstream,
>
> Hello, could you try this patch?
>
> queue_lock should protect against concurrent "error recovery",
> + mutex_lock(&queue->queue_lock);
Unfortunately I've just realized that queue_lock won't save us
from the race against the controller reset, it's still possible
we lock a destroyed mutex. So just try this
simplified patch, I will try to figure out something else:
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 841238f38fdd..b714e1691c30 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2660,7 +2660,10 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
set_bit(NVME_TCP_Q_POLLING, &queue->flags);
if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
sk_busy_loop(sk, true);
+
+ mutex_lock(&queue->send_mutex);
nvme_tcp_try_recv(queue);
+ mutex_unlock(&queue->send_mutex);
clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
return queue->nr_cqe;
}
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-12 9:47 ` zhang.guanghui
@ 2025-02-12 10:28 ` Maurizio Lombardi
2025-02-12 11:14 ` Maurizio Lombardi
0 siblings, 1 reply; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-12 10:28 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, chunguang.xu
Cc: mgurtovoy, sagi, kbusch, sashal, linux-kernel, linux-nvme,
linux-block
On Wed Feb 12, 2025 at 10:47 AM CET, zhang.guanghui@cestc.cn wrote:
> Hi, Thanks.
> I will test this patch, but I am worried whether it will affect the performance.
> Should we also consider null pointer protection?
Yes, it will likely affect the performance, just check if it works.
Probably it could be optimized by just protecting
nvme_tcp_fail_request(), which AFAICT is the only function in the
nvme_tcp_try_send() code that calls nvme_complete_rq().
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-12 10:28 ` Maurizio Lombardi
@ 2025-02-12 11:14 ` Maurizio Lombardi
2025-02-12 11:47 ` Maurizio Lombardi
0 siblings, 1 reply; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-12 11:14 UTC (permalink / raw)
To: Maurizio Lombardi, zhang.guanghui@cestc.cn, chunguang.xu
Cc: mgurtovoy, sagi, kbusch, sashal, linux-kernel, linux-nvme,
linux-block
On Wed Feb 12, 2025 at 11:28 AM CET, Maurizio Lombardi wrote:
> On Wed Feb 12, 2025 at 10:47 AM CET, zhang.guanghui@cestc.cn wrote:
>> Hi, Thanks.
>> I will test this patch, but I am worried whether it will affect the performance.
>> Should we also consider null pointer protection?
>
> Yes, it will likely affect the performance, just check if it works.
>
> Probably it could be optimized by just protecting
> nvme_tcp_fail_request(), which AFAICT is the only function in the
> nvme_tcp_try_send() code that calls nvme_complete_rq().
Something like that, maybe, not tested:
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 841238f38fdd..488edec35a65 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -146,6 +146,7 @@ struct nvme_tcp_queue {
 	struct mutex		queue_lock;
 	struct mutex		send_mutex;
+	struct mutex		poll_mutex;
 	struct llist_head	req_list;
 	struct list_head	send_list;
@@ -1259,7 +1260,9 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
 	} else if (ret < 0) {
 		dev_err(queue->ctrl->ctrl.device,
 			"failed to send request %d\n", ret);
+		mutex_lock(&queue->poll_mutex);
 		nvme_tcp_fail_request(queue->request);
+		mutex_unlock(&queue->poll_mutex);
 		nvme_tcp_done_send_req(queue);
 	}
 out:
@@ -1397,6 +1400,7 @@ static void nvme_tcp_free_queue(struct nvme_ctrl *nctrl, int qid)
 	kfree(queue->pdu);
 	mutex_destroy(&queue->send_mutex);
 	mutex_destroy(&queue->queue_lock);
+	mutex_destroy(&queue->poll_mutex);
 }
 
 static int nvme_tcp_init_connection(struct nvme_tcp_queue *queue)
@@ -1710,6 +1714,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid,
 	init_llist_head(&queue->req_list);
 	INIT_LIST_HEAD(&queue->send_list);
 	mutex_init(&queue->send_mutex);
+	mutex_init(&queue->poll_mutex);
 	INIT_WORK(&queue->io_work, nvme_tcp_io_work);
 
 	if (qid > 0)
@@ -2660,7 +2665,9 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 	set_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
 		sk_busy_loop(sk, true);
+	mutex_lock(&queue->poll_mutex);
 	nvme_tcp_try_recv(queue);
+	mutex_unlock(&queue->poll_mutex);
 	clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	return queue->nr_cqe;
 }
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-12 11:14 ` Maurizio Lombardi
@ 2025-02-12 11:47 ` Maurizio Lombardi
0 siblings, 0 replies; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-12 11:47 UTC (permalink / raw)
To: Maurizio Lombardi, zhang.guanghui@cestc.cn, chunguang.xu
Cc: mgurtovoy, sagi, kbusch, sashal, linux-kernel, linux-nvme,
linux-block
On Wed Feb 12, 2025 at 12:14 PM CET, Maurizio Lombardi wrote:
> On Wed Feb 12, 2025 at 11:28 AM CET, Maurizio Lombardi wrote:
>> On Wed Feb 12, 2025 at 10:47 AM CET, zhang.guanghui@cestc.cn wrote:
>>> Hi, Thanks.
>>> I will test this patch, but I am worried whether it will affect the performance.
>>> Should we also consider null pointer protection?
>>
>> Yes, it will likely affect the performance, just check if it works.
>>
>> Probably it could be optimized by just protecting
>> nvme_tcp_fail_request(), which AFAICT is the only function in the
>> nvme_tcp_try_send() code that calls nvme_complete_rq().
>
> Something like that, maybe, not tested:
Ah wait, this won't fix anything because it will end up with a double
completion.
Ok I am not sure how to fix this, someone else maybe has better ideas.
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-10 7:41 nvme-tcp: fix a possible UAF when failing to send request zhang.guanghui
2025-02-10 10:01 ` Maurizio Lombardi
@ 2025-02-12 15:33 ` Maurizio Lombardi
2025-02-12 16:07 ` Maurizio Lombardi
2025-02-17 7:46 ` Sagi Grimberg
2 siblings, 1 reply; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-12 15:33 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, sagi, mgurtovoy, kbusch, sashal,
chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On Mon Feb 10, 2025 at 8:41 AM CET, zhang.guanghui@cestc.cn wrote:
> Hello
>
> When using the nvme-tcp driver in a storage cluster, the driver may trigger a null pointer causing the host to crash several times.
> By analyzing the vmcore, we know the direct cause is that the request->mq_hctx was used after free.
>
> CPU1 CPU2
>
> nvme_tcp_poll nvme_tcp_try_send --failed to send reqrest 13
>
> nvme_tcp_try_recv nvme_tcp_fail_request
>
> nvme_tcp_recv_skb nvme_tcp_end_request
>
> nvme_tcp_recv_pdu nvme_complete_rq
>
> nvme_tcp_handle_comp nvme_retry_req -- request->mq_hctx have been freed, is NULL.
> nvme_tcp_process_nvme_cqe
>
> nvme_complete_rq
>
> nvme_end_req
>
> blk_mq_end_request
Taking a step back. Let's take a different approach and try to avoid the
double completion.
The problem here is that apparently we received a nvme_tcp_rsp capsule
from the target, meaning that the command has been processed (I guess
the capsule has an error status?)
So maybe only part of the command has been sent?
Why do we receive the rsp capsule at all? Shouldn't this be treated as a fatal
error by the controller?
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-12 15:33 ` Maurizio Lombardi
@ 2025-02-12 16:07 ` Maurizio Lombardi
2025-02-13 2:04 ` zhang.guanghui
0 siblings, 1 reply; 28+ messages in thread
From: Maurizio Lombardi @ 2025-02-12 16:07 UTC (permalink / raw)
To: Maurizio Lombardi, zhang.guanghui@cestc.cn, sagi, mgurtovoy,
kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On Wed Feb 12, 2025 at 4:33 PM CET, Maurizio Lombardi wrote:
>
> Taking a step back. Let's take a different approach and try to avoid the
> double completion.
>
> The problem here is that apparently we received a nvme_tcp_rsp capsule
> from the target, meaning that the command has been processed (I guess
> the capsule has an error status?)
>
> So maybe only part of the command has been sent?
> Why we receive the rsp capsule at all? Shouldn't this be treated as a fatal
> error by the controller?
The NVMe/TCP specification says
******
When a controller detects a fatal error, that controller shall:
1. stop processing any PDUs that arrive on the connection; and
2. send a C2HTermReq PDU
******
And indeed I see in the dmesg this:
nvme nvme2: unsupported pdu type (3)
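For reference, PDU type 3 is nvme_tcp_c2h_term (C2HTermReq); the PDU type values below are from include/linux/nvme-tcp.h, listed from memory, so worth double-checking against your tree:

enum nvme_tcp_pdu_type {
	nvme_tcp_icreq		= 0x0,	/* initialize connection request */
	nvme_tcp_icresp		= 0x1,	/* initialize connection response */
	nvme_tcp_h2c_term	= 0x2,	/* host -> controller terminate */
	nvme_tcp_c2h_term	= 0x3,	/* controller -> host terminate */
	nvme_tcp_cmd		= 0x4,	/* command capsule */
	nvme_tcp_rsp		= 0x5,	/* response capsule */
	nvme_tcp_h2c_data	= 0x6,
	nvme_tcp_c2h_data	= 0x7,
	nvme_tcp_r2t		= 0x9,
};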
This means the controller detected the problem and sent to the host the
C2HTermReq command. Upon receiving this command, the host is supposed to
close the connection.
Now I get it.
Zhang, do you have commit aeacfcefa218f4ed11da478e9b7915a37d1afaff in
your kernel, I guess you are missing it. Check it please.
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-12 16:07 ` Maurizio Lombardi
@ 2025-02-13 2:04 ` zhang.guanghui
0 siblings, 0 replies; 28+ messages in thread
From: zhang.guanghui @ 2025-02-13 2:04 UTC (permalink / raw)
To: Maurizio Lombardi, sagi, mgurtovoy, kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
Hi
In fact, the commit aeacfcefa218f4ed11da478e9b7915a37d1afaff has been synchronized, but the issue still occurs.
I think that when the host fails to send the request with -13, maybe only part of the command has been sent, or the controller has received a complete command,
but the controller may send nvme_tcp_rsp and C2HTermReq consecutively. I am now analyzing the controller's SPDK processing logic.
When the host uses nvme_tcp_poll(), it receives the nvme_tcp_rsp and handles it first,
and then the host hits the UAF and crashes.
zhang.guanghui@cestc.cn
From: Maurizio Lombardi
Date: 2025-02-13 00:07
To: Maurizio Lombardi; zhang.guanghui@cestc.cn; sagi; mgurtovoy; kbusch; sashal; chunguang.xu
CC: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request
On Wed Feb 12, 2025 at 4:33 PM CET, Maurizio Lombardi wrote:
>
> Taking a step back. Let's take a different approach and try to avoid the
> double completion.
>
> The problem here is that apparently we received a nvme_tcp_rsp capsule
> from the target, meaning that the command has been processed (I guess
> the capsule has an error status?)
>
> So maybe only part of the command has been sent?
> Why we receive the rsp capsule at all? Shouldn't this be treated as a fatal
> error by the controller?
--
> The NVMe/TCP specification says
******
When a controller detects a fatal error, that controller shall:
1. stop processing any PDUs that arrive on the connection; and
2. send a C2HTermReq PDU
******
And indeed I see in the dmesg this:
nvme nvme2: unsupported pdu type (3)
This means the controller detected the problem and sent to the host the
C2HTermReq command. Upon receiving this command, the host is supposed to
close the connection.
Now I get it.
Zhang, do you have commit aeacfcefa218f4ed11da478e9b7915a37d1afaff in
your kernel, I guess you are missing it. Check it please.
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-10 7:41 nvme-tcp: fix a possible UAF when failing to send request zhang.guanghui
2025-02-10 10:01 ` Maurizio Lombardi
2025-02-12 15:33 ` Maurizio Lombardi
@ 2025-02-17 7:46 ` Sagi Grimberg
2025-02-20 8:20 ` zhang.guanghui
2025-03-07 10:10 ` Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】 zhang.guanghui
2 siblings, 2 replies; 28+ messages in thread
From: Sagi Grimberg @ 2025-02-17 7:46 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, mgurtovoy, kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On 10/02/2025 9:41, zhang.guanghui@cestc.cn wrote:
> Hello
>
>
>
> When using the nvme-tcp driver in a storage cluster, the driver may trigger a null pointer causing the host to crash several times.
>
>
>
> By analyzing the vmcore, we know the direct cause is that the request->mq_hctx was used after free.
>
>
>
>
>
> CPU1 CPU2
>
>
>
> nvme_tcp_poll nvme_tcp_try_send --failed to send reqrest 13
>
>
>
> nvme_tcp_try_recv nvme_tcp_fail_request
>
>
>
> nvme_tcp_recv_skb nvme_tcp_end_request
>
>
>
> nvme_tcp_recv_pdu nvme_complete_rq
>
>
>
> nvme_tcp_handle_comp nvme_retry_req -- request->mq_hctx have been freed, is NULL.
>
>
>
> nvme_tcp_process_nvme_cqe
>
>
>
> nvme_complete_rq
>
>
>
> nvme_end_req
>
>
>
> blk_mq_end_request
>
>
>
>
>
>
>
> when nvme_tcp_try_send failed to send reqrest 13, it maybe be resulted by selinux or other reasons, this is a problem. then the nvme_tcp_fail_request would execute。
>
>
>
> but the nvme_tcp_recv_pdu may have received the responding pdu and the nvme_tcp_process_nvme_cqe would have completed the request. request->mq_hctx was used after free.
>
>
>
> the follow patch is to solve it.
Zhang, your email client needs fixing - it is impossible to follow your
emails.
>
>
>
> can you give some suggestions? thanks!
The problem is the C2HTerm that the host is unable to handle correctly.
And it also appears that nvme_tcp_poll() does not signal correctly to
blk-mq to stop
calling poll.
One thing to do is to handle C2HTerm PDU correctly, and, here is a
possible fix to try for the UAF issue:
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c637ff04a052..0e390e98aaf9 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2673,6 +2673,7 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 {
 	struct nvme_tcp_queue *queue = hctx->driver_data;
 	struct sock *sk = queue->sock->sk;
+	int ret;
 
 	if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
 		return 0;
@@ -2680,9 +2681,9 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
 	set_bit(NVME_TCP_Q_POLLING, &queue->flags);
 	if (sk_can_busy_loop(sk) && skb_queue_empty_lockless(&sk->sk_receive_queue))
 		sk_busy_loop(sk, true);
-	nvme_tcp_try_recv(queue);
+	ret = nvme_tcp_try_recv(queue);
 	clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
-	return queue->nr_cqe;
+	return ret < 0 ? ret : queue->nr_cqe;
 }
 
 static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
--
Does this help?
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request
2025-02-17 7:46 ` Sagi Grimberg
@ 2025-02-20 8:20 ` zhang.guanghui
2025-03-07 10:10 ` Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】 zhang.guanghui
1 sibling, 0 replies; 28+ messages in thread
From: zhang.guanghui @ 2025-02-20 8:20 UTC (permalink / raw)
To: sagi, mgurtovoy, kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
Hi
After testing this patch, no send-request failure has occurred and the issue has not been reproduced yet.
It may take a long time to test.
best wishes
zhang.guanghui@cestc.cn
From: Sagi Grimberg
Date: 2025-02-17 15:46
To: zhang.guanghui@cestc.cn; mgurtovoy; kbusch; sashal; chunguang.xu
CC: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意，邮件由sagigrim@gmail.com代发】
On 10/02/2025 9:41, zhang.guanghui@cestc.cn wrote:
> Hello
>
>
>
> When using the nvme-tcp driver in a storage cluster, the driver may trigger a null pointer causing the host to crash several times.
>
>
>
> By analyzing the vmcore, we know the direct cause is that the request->mq_hctx was used after free.
>
>
>
>
>
> CPU1 CPU2
>
>
>
> nvme_tcp_poll nvme_tcp_try_send --failed to send reqrest 13
>
>
>
> nvme_tcp_try_recv nvme_tcp_fail_request
>
>
>
> nvme_tcp_recv_skb nvme_tcp_end_request
>
>
>
> nvme_tcp_recv_pdu nvme_complete_rq
>
>
>
> nvme_tcp_handle_comp nvme_retry_req -- request->mq_hctx have been freed, is NULL.
>
>
>
> nvme_tcp_process_nvme_cqe
>
>
>
> nvme_complete_rq
>
>
>
> nvme_end_req
>
>
>
> blk_mq_end_request
>
>
>
>
>
>
>
> when nvme_tcp_try_send failed to send reqrest 13, it maybe be resulted by selinux or other reasons, this is a problem. then the nvme_tcp_fail_request would execute。
>
>
>
> but the nvme_tcp_recv_pdu may have received the responding pdu and the nvme_tcp_process_nvme_cqe would have completed the request. request->mq_hctx was used after free.
>
>
>
> the follow patch is to solve it.
Zhang, your email client needs fixing - it is impossible to follow your
emails.
>
>
>
> can you give some suggestions? thanks!
The problem is the C2HTerm that the host is unable to handle correctly.
And it also appears that nvme_tcp_poll() does not signal correctly to
blk-mq to stop
calling poll.
One thing to do is to handle C2HTerm PDU correctly, and, here is a
possible fix to try for the UAF issue:
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c637ff04a052..0e390e98aaf9 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2673,6 +2673,7 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx
*hctx, struct io_comp_batch *iob)
{
struct nvme_tcp_queue *queue = hctx->driver_data;
struct sock *sk = queue->sock->sk;
+ int ret;
if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
return 0;
@@ -2680,9 +2681,9 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx
*hctx, struct io_comp_batch *iob)
set_bit(NVME_TCP_Q_POLLING, &queue->flags);
if (sk_can_busy_loop(sk) &&
skb_queue_empty_lockless(&sk->sk_receive_queue))
sk_busy_loop(sk, true);
- nvme_tcp_try_recv(queue);
+ ret = nvme_tcp_try_recv(queue);
clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
- return queue->nr_cqe;
+ return ret < 0 ? ret : queue->nr_cqe;
}
static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int
size)
--
Does this help?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】
2025-02-17 7:46 ` Sagi Grimberg
2025-02-20 8:20 ` zhang.guanghui
@ 2025-03-07 10:10 ` zhang.guanghui
2025-03-11 10:52 ` Maurizio Lombardi
1 sibling, 1 reply; 28+ messages in thread
From: zhang.guanghui @ 2025-03-07 10:10 UTC (permalink / raw)
To: sagi, mgurtovoy, kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
Hi
After testing this patch, a send-request failure occurred; unfortunately, the issue still persists.
best wishes
zhang.guanghui@cestc.cn
From: Sagi Grimberg
Date: 2025-02-17 15:46
To: zhang.guanghui@cestc.cn; mgurtovoy; kbusch; sashal; chunguang.xu
CC: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意，邮件由sagigrim@gmail.com代发】
On 10/02/2025 9:41, zhang.guanghui@cestc.cn wrote:
> Hello
>
>
>
> When using the nvme-tcp driver in a storage cluster, the driver may trigger a null pointer causing the host to crash several times.
>
>
>
> By analyzing the vmcore, we know the direct cause is that the request->mq_hctx was used after free.
>
>
>
>
>
> CPU1 CPU2
>
>
>
> nvme_tcp_poll nvme_tcp_try_send --failed to send reqrest 13
>
>
>
> nvme_tcp_try_recv nvme_tcp_fail_request
>
>
>
> nvme_tcp_recv_skb nvme_tcp_end_request
>
>
>
> nvme_tcp_recv_pdu nvme_complete_rq
>
>
>
> nvme_tcp_handle_comp nvme_retry_req -- request->mq_hctx have been freed, is NULL.
>
>
>
> nvme_tcp_process_nvme_cqe
>
>
>
> nvme_complete_rq
>
>
>
> nvme_end_req
>
>
>
> blk_mq_end_request
>
>
>
>
>
>
>
> when nvme_tcp_try_send failed to send reqrest 13, it maybe be resulted by selinux or other reasons, this is a problem. then the nvme_tcp_fail_request would execute。
>
>
>
> but the nvme_tcp_recv_pdu may have received the responding pdu and the nvme_tcp_process_nvme_cqe would have completed the request. request->mq_hctx was used after free.
>
>
>
> the follow patch is to solve it.
Zhang, your email client needs fixing - it is impossible to follow your
emails.
>
>
>
> can you give some suggestions? thanks!
The problem is the C2HTerm that the host is unable to handle correctly.
And it also appears that nvme_tcp_poll() does not signal correctly to
blk-mq to stop
calling poll.
One thing to do is to handle C2HTerm PDU correctly, and, here is a
possible fix to try for the UAF issue:
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index c637ff04a052..0e390e98aaf9 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2673,6 +2673,7 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx
*hctx, struct io_comp_batch *iob)
{
struct nvme_tcp_queue *queue = hctx->driver_data;
struct sock *sk = queue->sock->sk;
+ int ret;
if (!test_bit(NVME_TCP_Q_LIVE, &queue->flags))
return 0;
@@ -2680,9 +2681,9 @@ static int nvme_tcp_poll(struct blk_mq_hw_ctx
*hctx, struct io_comp_batch *iob)
set_bit(NVME_TCP_Q_POLLING, &queue->flags);
if (sk_can_busy_loop(sk) &&
skb_queue_empty_lockless(&sk->sk_receive_queue))
sk_busy_loop(sk, true);
- nvme_tcp_try_recv(queue);
+ ret = nvme_tcp_try_recv(queue);
clear_bit(NVME_TCP_Q_POLLING, &queue->flags);
- return queue->nr_cqe;
+ return ret < 0 ? ret : queue->nr_cqe;
}
static int nvme_tcp_get_address(struct nvme_ctrl *ctrl, char *buf, int
size)
--
Does this help?
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】
2025-03-07 10:10 ` Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】 zhang.guanghui
@ 2025-03-11 10:52 ` Maurizio Lombardi
2025-03-13 1:48 ` zhang.guanghui
2025-03-28 9:24 ` Re: nvme-tcp: fix a possible UAF when failing to send request zhang.guanghui
0 siblings, 2 replies; 28+ messages in thread
From: Maurizio Lombardi @ 2025-03-11 10:52 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, sagi, mgurtovoy, kbusch, sashal,
chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On Fri Mar 7, 2025 at 11:10 AM CET, zhang.guanghui@cestc.cn wrote:
>
> Hi
>
> After testing this patch, sending request failure occurred, unfortunately, the issue still persists.
Maybe I am completely wrong but I am still quite convinced that the problem here
is that, despite the nvme_tcp_try_send() failure, the target sends a response capsule
for the command, leading to a double-completion in the host.
Sagi, what about taking this patch: https://lore.kernel.org/linux-nvme/20250306160322.1370300-2-mlombard@redhat.com/T/#u
and going a step further by not completing the request, leaving to the error
recovery handler the task of cleaning everything up?
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 327f3f2f5399..72c1d7948386 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1328,8 +1328,8 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
 	} else if (ret < 0) {
 		dev_err(queue->ctrl->ctrl.device,
 			"failed to send request %d\n", ret);
-		nvme_tcp_fail_request(queue->request);
 		nvme_tcp_done_send_req(queue);
+		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
 	}
 out:
 	memalloc_noreclaim_restore(noreclaim_flag);
Maurizio
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】
2025-03-11 10:52 ` Maurizio Lombardi
@ 2025-03-13 1:48 ` zhang.guanghui
2025-03-13 7:51 ` Hannes Reinecke
2025-03-28 9:24 ` Re: nvme-tcp: fix a possible UAF when failing to send request zhang.guanghui
1 sibling, 1 reply; 28+ messages in thread
From: zhang.guanghui @ 2025-03-13 1:48 UTC (permalink / raw)
To: Maurizio Lombardi, sagi, mgurtovoy, kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
Yes, the problem here is that, despite the nvme_tcp_try_send() failure, the target sends a response capsule for the command, leading to a UAF in the host.
Would it be more reasonable to disable queue->rd_enabled to prevent receiving? Thanks
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index be04c5f3856d..17407eb12ad9 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1203,8 +1203,9 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
 	} else if (ret < 0) {
 		dev_err(queue->ctrl->ctrl.device,
 			"failed to send request %d\n", ret);
-		nvme_tcp_fail_request(queue->request);
 		nvme_tcp_done_send_req(queue);
+		queue->rd_enabled = false;
+		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
 	}
 out:
 	memalloc_noreclaim_restore(noreclaim_flag);
zhang.guanghui@cestc.cn
From: Maurizio Lombardi
Date: 2025-03-11 18:52
To: zhang.guanghui@cestc.cn; sagi; mgurtovoy; kbusch; sashal; chunguang.xu
CC: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意，邮件由sagigrim@gmail.com代发】
On Fri Mar 7, 2025 at 11:10 AM CET, zhang.guanghui@cestc.cn wrote:
>
> Hi
>
> After testing this patch, sending request failure occurred, unfortunately, the issue still persists.
Maybe I am completely wrong but I am still quite convinced that the problem here
is that, despite the nvme_tcp_try_send() failure, the target sends a response capsule
for the command, leading to a double-completion in the host.
Sagi, what about taking this patch: https://lore.kernel.org/linux-nvme/20250306160322.1370300-2-mlombard@redhat.com/T/#u
and do a step further by not completing the request, leaving the error
recovery handler the task of cleaning everything up?
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 327f3f2f5399..72c1d7948386 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1328,8 +1328,8 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
} else if (ret < 0) {
dev_err(queue->ctrl->ctrl.device,
"failed to send request %d\n", ret);
- nvme_tcp_fail_request(queue->request);
nvme_tcp_done_send_req(queue);
+ nvme_tcp_error_recovery(&queue->ctrl->ctrl);
}
out:
memalloc_noreclaim_restore(noreclaim_flag);
Maurizio
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】
2025-03-13 1:48 ` zhang.guanghui
@ 2025-03-13 7:51 ` Hannes Reinecke
2025-03-13 8:18 ` zhang.guanghui
2025-03-13 8:38 ` zhang.guanghui
0 siblings, 2 replies; 28+ messages in thread
From: Hannes Reinecke @ 2025-03-13 7:51 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, Maurizio Lombardi, sagi, mgurtovoy,
kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On 3/13/25 02:48, zhang.guanghui@cestc.cn wrote:
> Yes, the problem here is that, despite the nvme_tcp_try_send() failure, the target sends a response capsule for the command, leading to a UAF in the host.
>
> Is it more reasonable to disable queue->rd_enabled to prevent receiving. Thanks
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index be04c5f3856d..17407eb12ad9 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1203,8 +1203,9 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
> } else if (ret < 0) {
> dev_err(queue->ctrl->ctrl.device,
> "failed to send request %d\n", ret);
> - nvme_tcp_fail_request(queue->request);
> nvme_tcp_done_send_req(queue);
> + queue->rd_enabled = false;
> + nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> }
> out:
> memalloc_noreclaim_restore(noreclaim_flag);
>
>
>
Hmm. In principle, yes. Problem is that network is a bi-directional
communication, and a failure on one side doesn't necessarily imply
a failure on the other.
In particular when the send side fails we should _continue_ to read
as we should be flushing the read side buffer before closing.
So I agree with starting error recovery, but not with disabling the
reading side (as we haven't encountered a read error).
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】
2025-03-13 7:51 ` Hannes Reinecke
@ 2025-03-13 8:18 ` zhang.guanghui
[not found] ` <2025031316313196627826@cestc.cn>
2025-03-13 8:38 ` zhang.guanghui
1 sibling, 1 reply; 28+ messages in thread
From: zhang.guanghui @ 2025-03-13 8:18 UTC (permalink / raw)
To: Hannes Reinecke, Maurizio Lombardi, sagi, mgurtovoy, kbusch,
sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
Hi, in fact, the target may send C2HTermReq.
zhang.guanghui@cestc.cn
From: Hannes Reinecke
Date: 2025-03-13 15:51
To: zhang.guanghui@cestc.cn; Maurizio Lombardi; sagi; mgurtovoy; kbusch; sashal; chunguang.xu
CC: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意，邮件由sagigrim@gmail.com代发】
On 3/13/25 02:48, zhang.guanghui@cestc.cn wrote:
> Yes, the problem here is that, despite the nvme_tcp_try_send() failure, the target sends a response capsule for the command, leading to a UAF in the host.
>
> Is it more reasonable to disable queue->rd_enabled to prevent receiving. Thanks
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index be04c5f3856d..17407eb12ad9 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1203,8 +1203,9 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
> } else if (ret < 0) {
> dev_err(queue->ctrl->ctrl.device,
> "failed to send request %d\n", ret);
> - nvme_tcp_fail_request(queue->request);
> nvme_tcp_done_send_req(queue);
> + queue->rd_enabled = false;
> + nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> }
> out:
> memalloc_noreclaim_restore(noreclaim_flag);
>
>
>
Hmm. In principle, yes. Problem is that network is a bi-directional
communication, and a failure on one side doesn't necessarily imply
a failure on the other.
In particular when the send side fails we should _continue_ to read
as we should be flushing the read side buffer before closing.
So I agree with starting error recovery, but not with disabling the
reading side (as we haven't encountered a read error).
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】
2025-03-13 7:51 ` Hannes Reinecke
2025-03-13 8:18 ` zhang.guanghui
@ 2025-03-13 8:38 ` zhang.guanghui
1 sibling, 0 replies; 28+ messages in thread
From: zhang.guanghui @ 2025-03-13 8:38 UTC (permalink / raw)
To: Hannes Reinecke, Maurizio Lombardi, sagi, mgurtovoy, kbusch,
sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
Hi,
In fact, on an nvme_tcp_try_send() failure the target may send a C2HTermReq immediately, and the host receives the C2HTermReq while it is still starting error recovery.
So when queue->rd_enabled is false, we can avoid starting error recovery again. Thanks
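For reference, queue->rd_enabled is what nvme_tcp_data_ready() checks before rescheduling io_work, roughly (paraphrased from drivers/nvme/host/tcp.c; details vary by kernel version):

static void nvme_tcp_data_ready(struct sock *sk)
{
	struct nvme_tcp_queue *queue;

	read_lock_bh(&sk->sk_callback_lock);
	queue = sk->sk_user_data;
	/* io_work is only rescheduled when reads are still enabled and
	 * the queue is not currently being polled */
	if (likely(queue && queue->rd_enabled) &&
	    !test_bit(NVME_TCP_Q_POLLING, &queue->flags))
		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
	read_unlock_bh(&sk->sk_callback_lock);
}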
zhang.guanghui@cestc.cn
From: Hannes Reinecke
Date: 2025-03-13 15:51
To: zhang.guanghui@cestc.cn; Maurizio Lombardi; sagi; mgurtovoy; kbusch; sashal; chunguang.xu
CC: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意，邮件由sagigrim@gmail.com代发】
On 3/13/25 02:48, zhang.guanghui@cestc.cn wrote:
> Yes, the problem here is that, despite the nvme_tcp_try_send() failure, the target sends a response capsule for the command, leading to a UAF in the host.
>
> Is it more reasonable to disable queue->rd_enabled to prevent receiving. Thanks
>
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index be04c5f3856d..17407eb12ad9 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1203,8 +1203,9 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
> } else if (ret < 0) {
> dev_err(queue->ctrl->ctrl.device,
> "failed to send request %d\n", ret);
> - nvme_tcp_fail_request(queue->request);
> nvme_tcp_done_send_req(queue);
> + queue->rd_enabled = false;
> + nvme_tcp_error_recovery(&queue->ctrl->ctrl);
> }
> out:
> memalloc_noreclaim_restore(noreclaim_flag);
>
>
>
Hmm. In principle, yes. Problem is that network is a bi-directional
communication, and a failure on one side doesn't necessarily imply
a failure on the other.
In particular when the send side fails we should _continue_ to read
as we should be flushing the read side buffer before closing.
So I agree with starting error recovery, but not with disabling the
reading side (as we haven't encountered a read error).
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】
[not found] ` <2025031316313196627826@cestc.cn>
@ 2025-03-13 9:01 ` Maurizio Lombardi
0 siblings, 0 replies; 28+ messages in thread
From: Maurizio Lombardi @ 2025-03-13 9:01 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, Hannes Reinecke, sagi, mgurtovoy, kbusch,
sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On Thu Mar 13, 2025 at 9:31 AM CET, zhang.guanghui@cestc.cn wrote:
> Hi,
> in fact, the nvme_tcp_try_send() failure, the target may send C2HTermReq immediately. while the host receives the C2HTermReq and still starting error recovery.
> so when queue->rd_enabled is false, can avoid starting error recovery agagin.
Not all targets send C2HTermReq (for example, the Linux target doesn't
at the moment) so you can't rely on that.
In any case, calling nvme_tcp_error_recovery() twice is harmless;
the first call moves the controller to the resetting state, the second
call is ignored.
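For reference, that is because of the state check at the top of the error-recovery path, roughly (paraphrased from drivers/nvme/host/tcp.c; the exact form varies by kernel version):

static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
{
	/* only the first caller wins the transition to RESETTING;
	 * subsequent calls return without queueing err_work again */
	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
		return;

	dev_warn(ctrl->device, "starting error recovery\n");
	queue_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work);
}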
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: Re: nvme-tcp: fix a possible UAF when failing to send request
2025-03-11 10:52 ` Maurizio Lombardi
2025-03-13 1:48 ` zhang.guanghui
@ 2025-03-28 9:24 ` zhang.guanghui
2025-04-01 12:11 ` Maurizio Lombardi
1 sibling, 1 reply; 28+ messages in thread
From: zhang.guanghui @ 2025-03-28 9:24 UTC (permalink / raw)
To: Maurizio Lombardi, sagi, mgurtovoy, kbusch, sashal, chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
Hi,
So far, the UAF issue has not occurred while testing this patch in the same situation.
Additionally, the patch triggers a compilation warning (nvme_tcp_fail_request() is now defined but not used), please also review that.
Thanks
zhang.guanghui@cestc.cn
From: Maurizio Lombardi
Date: 2025-03-11 18:52
To: zhang.guanghui@cestc.cn; sagi; mgurtovoy; kbusch; sashal; chunguang.xu
CC: linux-kernel; linux-nvme; linux-block
Subject: Re: nvme-tcp: fix a possible UAF when failing to send request【请注意，邮件由sagigrim@gmail.com代发】
On Fri Mar 7, 2025 at 11:10 AM CET, zhang.guanghui@cestc.cn wrote:
>
> Hi
>
> After testing this patch, sending request failure occurred, unfortunately, the issue still persists.
Maybe I am completely wrong but I am still quite convinced that the problem here
is that, despite the nvme_tcp_try_send() failure, the target sends a response capsule
for the command, leading to a double-completion in the host.
Sagi, what about taking this patch: https://lore.kernel.org/linux-nvme/20250306160322.1370300-2-mlombard@redhat.com/T/#u
and do a step further by not completing the request, leaving the error
recovery handler the task of cleaning everything up?
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 327f3f2f5399..72c1d7948386 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1328,8 +1328,8 @@ static int nvme_tcp_try_send(struct nvme_tcp_queue *queue)
} else if (ret < 0) {
dev_err(queue->ctrl->ctrl.device,
"failed to send request %d\n", ret);
- nvme_tcp_fail_request(queue->request);
nvme_tcp_done_send_req(queue);
+ nvme_tcp_error_recovery(&queue->ctrl->ctrl);
}
out:
memalloc_noreclaim_restore(noreclaim_flag);
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: nvme-tcp: fix a possible UAF when failing to send request
2025-03-28 9:24 ` Re: nvme-tcp: fix a possible UAF when failing to send request zhang.guanghui
@ 2025-04-01 12:11 ` Maurizio Lombardi
0 siblings, 0 replies; 28+ messages in thread
From: Maurizio Lombardi @ 2025-04-01 12:11 UTC (permalink / raw)
To: zhang.guanghui@cestc.cn, sagi, mgurtovoy, kbusch, sashal,
chunguang.xu
Cc: linux-kernel, linux-nvme, linux-block
On Fri Mar 28, 2025 at 10:24 AM CET, zhang.guanghui@cestc.cn wrote:
> Hi,
>
> So far, the UAF issue has not occurred while testing this patch in the same situation.
> Additionally, the patch triggers a compilation warning (nvme_tcp_fail_request() is now defined but not used), please also review that.
>
Ok, I am going to submit a formal patch.
Thanks for testing it,
Maurizio
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads: [~2025-04-01 12:18 UTC | newest]
Thread overview: 28+ messages
2025-02-10 7:41 nvme-tcp: fix a possible UAF when failing to send request zhang.guanghui
2025-02-10 10:01 ` Maurizio Lombardi
2025-02-10 10:24 ` Max Gurtovoy
2025-02-10 11:16 ` zhang.guanghui
2025-02-10 11:34 ` Max Gurtovoy
2025-02-10 16:40 ` Maurizio Lombardi
[not found] ` <CAAO4dAWdsMjYMp9jdWXd_48aG0mTtVpRONqjJJ1scnc773tHzg@mail.gmail.com>
2025-02-11 8:04 ` zhang.guanghui
2025-02-12 8:11 ` Maurizio Lombardi
2025-02-12 8:23 ` Maurizio Lombardi
2025-02-12 8:52 ` Maurizio Lombardi
2025-02-12 9:47 ` zhang.guanghui
2025-02-12 10:28 ` Maurizio Lombardi
2025-02-12 11:14 ` Maurizio Lombardi
2025-02-12 11:47 ` Maurizio Lombardi
2025-02-12 15:33 ` Maurizio Lombardi
2025-02-12 16:07 ` Maurizio Lombardi
2025-02-13 2:04 ` zhang.guanghui
2025-02-17 7:46 ` Sagi Grimberg
2025-02-20 8:20 ` zhang.guanghui
2025-03-07 10:10 ` Re: nvme-tcp: fix a possible UAF when failing to send request【请注意,邮件由sagigrim@gmail.com代发】 zhang.guanghui
2025-03-11 10:52 ` Maurizio Lombardi
2025-03-13 1:48 ` zhang.guanghui
2025-03-13 7:51 ` Hannes Reinecke
2025-03-13 8:18 ` zhang.guanghui
[not found] ` <2025031316313196627826@cestc.cn>
2025-03-13 9:01 ` Maurizio Lombardi
2025-03-13 8:38 ` zhang.guanghui
2025-03-28 9:24 ` Re: nvme-tcp: fix a possible UAF when failing to send request zhang.guanghui
2025-04-01 12:11 ` Maurizio Lombardi