* [PATCH] scsi: libiscsi: fix NOP race condition
@ 2020-09-18 21:09 Lee Duncan
2020-11-04 21:33 ` Mike Christie
0 siblings, 1 reply; 4+ messages in thread
From: Lee Duncan @ 2020-09-18 21:09 UTC (permalink / raw)
To: linux-scsi; +Cc: open-iscsi, Lee Duncan
From: Lee Duncan <lduncan@suse.com>
iSCSI NOPs are sometimes "lost", mistakenly sent to the
user-land iscsid daemon instead of handled in the kernel,
as they should be, resulting in a message from the daemon like:
> iscsid: Got nop in, but kernel supports nop handling.
This can occur because of the new forward- and back-locks,
and the fact that an iSCSI NOP response can occur before
processing of the NOP send is complete. This can result
in "conn->ping_task" being NULL in iscsi_nop_out_rsp(),
when the pointer is actually in the process of being set.
To work around this, we add a new state to the "ping_task"
pointer. In addition to NULL (not assigned) and a pointer
(assigned), we add the state "being set", which is signaled
with an INVALID pointer (using "-1").
Signed-off-by: Lee Duncan <lduncan@suse.com>
---
drivers/scsi/libiscsi.c | 11 ++++++++++-
include/scsi/libiscsi.h | 3 +++
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 1e9c3171fa9f..5eb064787ee2 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -738,6 +738,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
task->conn->session->age);
}
+ if (unlikely(READ_ONCE(conn->ping_task) == INVALID_SCSI_TASK))
+ WRITE_ONCE(conn->ping_task, task);
+
if (!ihost->workq) {
if (iscsi_prep_mgmt_task(conn, task))
goto free_task;
@@ -941,6 +944,11 @@ static int iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr)
struct iscsi_nopout hdr;
struct iscsi_task *task;
+ if (!rhdr) {
+ if (READ_ONCE(conn->ping_task))
+ return -EINVAL;
+ WRITE_ONCE(conn->ping_task, INVALID_SCSI_TASK);
+ }
if (!rhdr && conn->ping_task)
return -EINVAL;
@@ -957,11 +965,12 @@ static int iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr)
task = __iscsi_conn_send_pdu(conn, (struct iscsi_hdr *)&hdr, NULL, 0);
if (!task) {
+ if (!rhdr)
+ WRITE_ONCE(conn->ping_task, NULL);
iscsi_conn_printk(KERN_ERR, conn, "Could not send nopout\n");
return -EIO;
} else if (!rhdr) {
/* only track our nops */
- conn->ping_task = task;
conn->last_ping = jiffies;
}
diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h
index c25fb86ffae9..b3bbd10eb3f0 100644
--- a/include/scsi/libiscsi.h
+++ b/include/scsi/libiscsi.h
@@ -132,6 +132,9 @@ struct iscsi_task {
void *dd_data; /* driver/transport data */
};
+/* invalid scsi_task pointer */
+#define INVALID_SCSI_TASK (struct iscsi_task *)-1l
+
static inline int iscsi_task_has_unsol_data(struct iscsi_task *task)
{
return task->unsol_r2t.data_length > task->unsol_r2t.sent;
--
2.26.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] scsi: libiscsi: fix NOP race condition
2020-09-18 21:09 [PATCH] scsi: libiscsi: fix NOP race condition Lee Duncan
@ 2020-11-04 21:33 ` Mike Christie
2020-11-04 21:37 ` Mike Christie
2020-11-05 18:30 ` Lee Duncan
0 siblings, 2 replies; 4+ messages in thread
From: Mike Christie @ 2020-11-04 21:33 UTC (permalink / raw)
To: Lee Duncan, linux-scsi; +Cc: open-iscsi, Lee Duncan
On 9/18/20 4:09 PM, Lee Duncan wrote:
> From: Lee Duncan <lduncan@suse.com>
>
> iSCSI NOPs are sometimes "lost", mistakenly sent to the
> user-land iscsid daemon instead of handled in the kernel,
> as they should be, resulting in a message from the daemon like:
>
>> iscsid: Got nop in, but kernel supports nop handling.
>
> This can occur because of the new forward- and back-locks,
> and the fact that an iSCSI NOP response can occur before
> processing of the NOP send is complete. This can result
> in "conn->ping_task" being NULL in iscsi_nop_out_rsp(),
> when the pointer is actually in the process of being set.
>
> To work around this, we add a new state to the "ping_task"
> pointer. In addition to NULL (not assigned) and a pointer
> (assigned), we add the state "being set", which is signaled
> with an INVALID pointer (using "-1").
>
> Signed-off-by: Lee Duncan <lduncan@suse.com>
> ---
> drivers/scsi/libiscsi.c | 11 ++++++++++-
> include/scsi/libiscsi.h | 3 +++
> 2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 1e9c3171fa9f..5eb064787ee2 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -738,6 +738,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn, struct iscsi_hdr *hdr,
> task->conn->session->age);
> }
>
> + if (unlikely(READ_ONCE(conn->ping_task) == INVALID_SCSI_TASK))
> + WRITE_ONCE(conn->ping_task, task);
> +
> if (!ihost->workq) {
> if (iscsi_prep_mgmt_task(conn, task))
> goto free_task;
> @@ -941,6 +944,11 @@ static int iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr)
> struct iscsi_nopout hdr;
> struct iscsi_task *task;
>
> + if (!rhdr) {
> + if (READ_ONCE(conn->ping_task))
> + return -EINVAL;
> + WRITE_ONCE(conn->ping_task, INVALID_SCSI_TASK);
> + }
> if (!rhdr && conn->ping_task)
> return -EINVAL;
>
> @@ -957,11 +965,12 @@ static int iscsi_send_nopout(struct iscsi_conn *conn, struct iscsi_nopin *rhdr)
>
> task = __iscsi_conn_send_pdu(conn, (struct iscsi_hdr *)&hdr, NULL, 0);
> if (!task) {
> + if (!rhdr)
> + WRITE_ONCE(conn->ping_task, NULL);
I don't think you need this. If __iscsi_conn_send_pdu returns NULL, it
will have done __iscsi_put_task and done this already.
> iscsi_conn_printk(KERN_ERR, conn, "Could not send nopout\n");
> return -EIO;
> } else if (!rhdr) {
> /* only track our nops */
> - conn->ping_task = task;
> conn->last_ping = jiffies;
> }
Why in the send path do we always use the READ_ONCE/WRITE_ONCE, but in
the completion path like in iscsi_complete_task we don't.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] scsi: libiscsi: fix NOP race condition
2020-11-04 21:33 ` Mike Christie
@ 2020-11-04 21:37 ` Mike Christie
2020-11-05 18:30 ` Lee Duncan
1 sibling, 0 replies; 4+ messages in thread
From: Mike Christie @ 2020-11-04 21:37 UTC (permalink / raw)
To: Lee Duncan, linux-scsi; +Cc: open-iscsi, Lee Duncan
On 11/4/20 3:33 PM, Mike Christie wrote:
> On 9/18/20 4:09 PM, Lee Duncan wrote:
>> From: Lee Duncan <lduncan@suse.com>
>>
>> iSCSI NOPs are sometimes "lost", mistakenly sent to the
>> user-land iscsid daemon instead of handled in the kernel,
>> as they should be, resulting in a message from the daemon like:
>>
>>> iscsid: Got nop in, but kernel supports nop handling.
>>
>> This can occur because of the new forward- and back-locks,
>> and the fact that an iSCSI NOP response can occur before
>> processing of the NOP send is complete. This can result
>> in "conn->ping_task" being NULL in iscsi_nop_out_rsp(),
>> when the pointer is actually in the process of being set.
>>
>> To work around this, we add a new state to the "ping_task"
>> pointer. In addition to NULL (not assigned) and a pointer
>> (assigned), we add the state "being set", which is signaled
>> with an INVALID pointer (using "-1").
>>
>> Signed-off-by: Lee Duncan <lduncan@suse.com>
>> ---
>> drivers/scsi/libiscsi.c | 11 ++++++++++-
>> include/scsi/libiscsi.h | 3 +++
>> 2 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>> index 1e9c3171fa9f..5eb064787ee2 100644
>> --- a/drivers/scsi/libiscsi.c
>> +++ b/drivers/scsi/libiscsi.c
>> @@ -738,6 +738,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn,
>> struct iscsi_hdr *hdr,
>> task->conn->session->age);
>> }
>> + if (unlikely(READ_ONCE(conn->ping_task) == INVALID_SCSI_TASK))
>> + WRITE_ONCE(conn->ping_task, task);
>> +
>> if (!ihost->workq) {
>> if (iscsi_prep_mgmt_task(conn, task))
>> goto free_task;
>> @@ -941,6 +944,11 @@ static int iscsi_send_nopout(struct iscsi_conn
>> *conn, struct iscsi_nopin *rhdr)
>> struct iscsi_nopout hdr;
>> struct iscsi_task *task;
>> + if (!rhdr) {
>> + if (READ_ONCE(conn->ping_task))
>> + return -EINVAL;
>> + WRITE_ONCE(conn->ping_task, INVALID_SCSI_TASK);
>> + }
>> if (!rhdr && conn->ping_task)
>> return -EINVAL;
>> @@ -957,11 +965,12 @@ static int iscsi_send_nopout(struct iscsi_conn
>> *conn, struct iscsi_nopin *rhdr)
>> task = __iscsi_conn_send_pdu(conn, (struct iscsi_hdr *)&hdr,
>> NULL, 0);
>> if (!task) {
>> + if (!rhdr)
>> + WRITE_ONCE(conn->ping_task, NULL);
>
> I don't think you need this. If __iscsi_conn_send_pdu returns NULL, it
> will have done __iscsi_put_task and done this already.
Ignore that. That is iscsi_complete_task that would do it.
>
>> iscsi_conn_printk(KERN_ERR, conn, "Could not send nopout\n");
>> return -EIO;
>> } else if (!rhdr) {
>> /* only track our nops */
>> - conn->ping_task = task;
>> conn->last_ping = jiffies;
>> }
>
> Why in the send path do we always use the READ_ONCE/WRITE_ONCE, but in
> the completion path like in iscsi_complete_task we don't.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] scsi: libiscsi: fix NOP race condition
2020-11-04 21:33 ` Mike Christie
2020-11-04 21:37 ` Mike Christie
@ 2020-11-05 18:30 ` Lee Duncan
1 sibling, 0 replies; 4+ messages in thread
From: Lee Duncan @ 2020-11-05 18:30 UTC (permalink / raw)
To: Mike Christie, linux-scsi; +Cc: open-iscsi
On 11/4/20 1:33 PM, Mike Christie wrote:
> On 9/18/20 4:09 PM, Lee Duncan wrote:
>> From: Lee Duncan <lduncan@suse.com>
>>
>> iSCSI NOPs are sometimes "lost", mistakenly sent to the
>> user-land iscsid daemon instead of handled in the kernel,
>> as they should be, resulting in a message from the daemon like:
>>
>>> iscsid: Got nop in, but kernel supports nop handling.
>>
>> This can occur because of the new forward- and back-locks,
>> and the fact that an iSCSI NOP response can occur before
>> processing of the NOP send is complete. This can result
>> in "conn->ping_task" being NULL in iscsi_nop_out_rsp(),
>> when the pointer is actually in the process of being set.
>>
>> To work around this, we add a new state to the "ping_task"
>> pointer. In addition to NULL (not assigned) and a pointer
>> (assigned), we add the state "being set", which is signaled
>> with an INVALID pointer (using "-1").
>>
>> Signed-off-by: Lee Duncan <lduncan@suse.com>
>> ---
>> drivers/scsi/libiscsi.c | 11 ++++++++++-
>> include/scsi/libiscsi.h | 3 +++
>> 2 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
>> index 1e9c3171fa9f..5eb064787ee2 100644
>> --- a/drivers/scsi/libiscsi.c
>> +++ b/drivers/scsi/libiscsi.c
>> @@ -738,6 +738,9 @@ __iscsi_conn_send_pdu(struct iscsi_conn *conn,
>> struct iscsi_hdr *hdr,
>> task->conn->session->age);
>> }
>> + if (unlikely(READ_ONCE(conn->ping_task) == INVALID_SCSI_TASK))
>> + WRITE_ONCE(conn->ping_task, task);
>> +
>> if (!ihost->workq) {
>> if (iscsi_prep_mgmt_task(conn, task))
>> goto free_task;
>> @@ -941,6 +944,11 @@ static int iscsi_send_nopout(struct iscsi_conn
>> *conn, struct iscsi_nopin *rhdr)
>> struct iscsi_nopout hdr;
>> struct iscsi_task *task;
>> + if (!rhdr) {
>> + if (READ_ONCE(conn->ping_task))
>> + return -EINVAL;
>> + WRITE_ONCE(conn->ping_task, INVALID_SCSI_TASK);
>> + }
>> if (!rhdr && conn->ping_task)
>> return -EINVAL;
>> @@ -957,11 +965,12 @@ static int iscsi_send_nopout(struct iscsi_conn
>> *conn, struct iscsi_nopin *rhdr)
>> task = __iscsi_conn_send_pdu(conn, (struct iscsi_hdr *)&hdr,
>> NULL, 0);
>> if (!task) {
>> + if (!rhdr)
>> + WRITE_ONCE(conn->ping_task, NULL);
>
> I don't think you need this. If __iscsi_conn_send_pdu returns NULL, it
> will have done __iscsi_put_task and done this already.
Not an issue, as you already replied.
>
>> iscsi_conn_printk(KERN_ERR, conn, "Could not send nopout\n");
>> return -EIO;
>> } else if (!rhdr) {
>> /* only track our nops */
>> - conn->ping_task = task;
>> conn->last_ping = jiffies;
>> }
>
> Why in the send path do we always use the READ_ONCE/WRITE_ONCE, but in
> the completion path like in iscsi_complete_task we don't.
>
The answer is that I was only modifying the code that needed changing
for this bug. My first pass did not use READ_ONCE() or WRITE_ONCE(), but
Hannes suggested the change.
Now that I think about it more, the memory barrier stuff would make
sense only if all the access to that field are protected.
I will resubmit V2 of the patch.
--
Lee Duncan
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2020-11-05 18:30 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-09-18 21:09 [PATCH] scsi: libiscsi: fix NOP race condition Lee Duncan
2020-11-04 21:33 ` Mike Christie
2020-11-04 21:37 ` Mike Christie
2020-11-05 18:30 ` Lee Duncan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).