* [Qemu-devel] libiscsi task cancellation
@ 2018-02-08 14:08 Stefan Hajnoczi
2018-02-08 14:14 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Stefan Hajnoczi @ 2018-02-08 14:08 UTC (permalink / raw)
To: ronnie sahlberg
Cc: Felipe Franciosi, Peter Lieven, Paolo Bonzini, qemu-devel,
Hannes Reinecke
SAM-5, 5.6 Aborting commands says:
A command is aborted when a SCSI device condition (see 6.3), command,
or task management function causes termination of the command prior to
its completion by the device server. After a command is aborted and
TASK ABORTED status, if any, is returned, the SCSI target device shall
send no further requests or responses for that command.
Then "Table 48 — Task management functions that abort commands (part 1
of 2)" goes on to say that no TASK ABORTED status is returned for
aborted commands "if the ABORT TASK scope is I_T_L nexus".
My interpretation is that the target doesn't have to send a response
for the aborted command. It just needs to complete the ABORT TASK
TMF. Afterwards the initiator knows the task is gone and will receive
no more responses.
Is my understanding correct?
Now on to libiscsi:
The iscsi_task_mgmt_async() API documentation says:
* abort_task will also cancel the scsi task. The callback for the
scsi task will be invoked with
* SCSI_STATUS_CANCELLED
I see that the ABORT TASK TMF response invokes the user's
iscsi_task_mgmt_async() callback but not the command callback. I'm
not sure how the command callback is invoked with
SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
that response.
Is libiscsi honoring its iscsi_task_mgmt_async() contract?
Thanks,
Stefan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] libiscsi task cancellation
2018-02-08 14:08 [Qemu-devel] libiscsi task cancellation Stefan Hajnoczi
@ 2018-02-08 14:14 ` Paolo Bonzini
2018-02-08 14:58 ` Hannes Reinecke
2018-02-08 15:08 ` Stefan Hajnoczi
0 siblings, 2 replies; 8+ messages in thread
From: Paolo Bonzini @ 2018-02-08 14:14 UTC (permalink / raw)
To: Stefan Hajnoczi, ronnie sahlberg
Cc: Felipe Franciosi, Peter Lieven, qemu-devel, Hannes Reinecke
On 08/02/2018 15:08, Stefan Hajnoczi wrote:
> Now on to libiscsi:
>
> The iscsi_task_mgmt_async() API documentation says:
>
> * abort_task will also cancel the scsi task. The callback for the
> scsi task will be invoked with
> * SCSI_STATUS_CANCELLED
>
> I see that the ABORT TASK TMF response invokes the user's
> iscsi_task_mgmt_async() callback but not the command callback. I'm
> not sure how the command callback is invoked with
> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
> that response.
>
> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
No, and QEMU is assuming the "wrong" behavior:
static void
iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
void *private_data)
{
IscsiAIOCB *acb = private_data;
acb->status = -ECANCELED;
iscsi_schedule_bh(acb);
}
Paolo
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] libiscsi task cancellation
2018-02-08 14:14 ` Paolo Bonzini
@ 2018-02-08 14:58 ` Hannes Reinecke
2018-02-08 15:12 ` Stefan Hajnoczi
2018-02-08 15:33 ` Felipe Franciosi
2018-02-08 15:08 ` Stefan Hajnoczi
1 sibling, 2 replies; 8+ messages in thread
From: Hannes Reinecke @ 2018-02-08 14:58 UTC (permalink / raw)
To: Paolo Bonzini, Stefan Hajnoczi, ronnie sahlberg
Cc: Felipe Franciosi, Peter Lieven, qemu-devel
On 02/08/2018 03:14 PM, Paolo Bonzini wrote:
> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
>> Now on to libiscsi:
>>
>> The iscsi_task_mgmt_async() API documentation says:
>>
>> * abort_task will also cancel the scsi task. The callback for the
>> scsi task will be invoked with
>> * SCSI_STATUS_CANCELLED
>>
>> I see that the ABORT TASK TMF response invokes the user's
>> iscsi_task_mgmt_async() callback but not the command callback. I'm
>> not sure how the command callback is invoked with
>> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
>> that response.
>>
>> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
>
> No, and QEMU is assuming the "wrong" behavior:
>
> static void
> iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
> void *private_data)
> {
> IscsiAIOCB *acb = private_data;
>
> acb->status = -ECANCELED;
> iscsi_schedule_bh(acb);
> }
>
The definition of ABORT TASK TMF in SAM is pretty much useless.
To quote:
A response of FUNCTION COMPLETE shall indicate that the task was aborted
or was not in the task set.
IE we have no idea if we ever managed to abort the task; if the task had
been in-flight by the time we've send the TMF we'll be getting a
FUNCTION COMPLETE, too.
So most FC HBA firmware implement the abort task with just a blacklist;
the TMF will be returned immediately and the command response will be
dropped if and when is arrives.
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] libiscsi task cancellation
2018-02-08 14:14 ` Paolo Bonzini
2018-02-08 14:58 ` Hannes Reinecke
@ 2018-02-08 15:08 ` Stefan Hajnoczi
2018-02-08 15:13 ` Stefan Hajnoczi
2018-02-14 16:12 ` Felipe Franciosi
1 sibling, 2 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2018-02-08 15:08 UTC (permalink / raw)
To: Paolo Bonzini
Cc: ronnie sahlberg, Felipe Franciosi, Peter Lieven, qemu-devel,
Hannes Reinecke
On Thu, Feb 8, 2018 at 2:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
>> Now on to libiscsi:
>>
>> The iscsi_task_mgmt_async() API documentation says:
>>
>> * abort_task will also cancel the scsi task. The callback for the
>> scsi task will be invoked with
>> * SCSI_STATUS_CANCELLED
>>
>> I see that the ABORT TASK TMF response invokes the user's
>> iscsi_task_mgmt_async() callback but not the command callback. I'm
>> not sure how the command callback is invoked with
>> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
>> that response.
>>
>> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
>
> No, and QEMU is assuming the "wrong" behavior:
>
> static void
> iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
> void *private_data)
> {
> IscsiAIOCB *acb = private_data;
>
> acb->status = -ECANCELED;
> iscsi_schedule_bh(acb);
> }
FWIW My recent use-after-free fix series is still pending review:
https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg00570.html
It changes the code and fixes a QEMU bug, but it doesn't address the
iscsi_task_mgmt_async() issue we're discussing now.
I think users *can* work around the iscsi_task_mgmt_async() issue as follows:
If the command callback has not been invoked yet when the ABORT TASK
TMF callback is invoked, call iscsi_scsi_cancel_task() so that
libiscsi invokes the command callback with SCSI_STATUS_CANCELLED and
frees the PDU.
Here is how that looks in QEMU:
diff --git a/block/iscsi.c b/block/iscsi.c
index 8140baac15..7b4b43dc55 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -290,8 +290,12 @@ iscsi_abort_task_cb(struct iscsi_context *iscsi,
int status, void *command_data,
{
IscsiAIOCB *acb = private_data;
- acb->status = -ECANCELED;
- iscsi_schedule_bh(acb);
+ /* If the command callback hasn't been called yet, drop the task */
+ if (!acb->bh) {
+ /* Call iscsi_aio_ioctl_cb() with SCSI_STATUS_CANCELLED */
+ iscsi_scsi_cancel_task(iscsi, acb->task);
+ }
+
qemu_aio_unref(acb); /* acquired in iscsi_aio_cancel() */
}
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] libiscsi task cancellation
2018-02-08 14:58 ` Hannes Reinecke
@ 2018-02-08 15:12 ` Stefan Hajnoczi
2018-02-08 15:33 ` Felipe Franciosi
1 sibling, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2018-02-08 15:12 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Paolo Bonzini, ronnie sahlberg, Felipe Franciosi, Peter Lieven,
qemu-devel
On Thu, Feb 8, 2018 at 2:58 PM, Hannes Reinecke <hare@suse.de> wrote:
> On 02/08/2018 03:14 PM, Paolo Bonzini wrote:
>> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
>>> Now on to libiscsi:
>>>
>>> The iscsi_task_mgmt_async() API documentation says:
>>>
>>> * abort_task will also cancel the scsi task. The callback for the
>>> scsi task will be invoked with
>>> * SCSI_STATUS_CANCELLED
>>>
>>> I see that the ABORT TASK TMF response invokes the user's
>>> iscsi_task_mgmt_async() callback but not the command callback. I'm
>>> not sure how the command callback is invoked with
>>> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
>>> that response.
>>>
>>> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
>>
>> No, and QEMU is assuming the "wrong" behavior:
>>
>> static void
>> iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
>> void *private_data)
>> {
>> IscsiAIOCB *acb = private_data;
>>
>> acb->status = -ECANCELED;
>> iscsi_schedule_bh(acb);
>> }
>>
> The definition of ABORT TASK TMF in SAM is pretty much useless.
> To quote:
>
> A response of FUNCTION COMPLETE shall indicate that the task was aborted
> or was not in the task set.
>
> IE we have no idea if we ever managed to abort the task; if the task had
> been in-flight by the time we've send the TMF we'll be getting a
> FUNCTION COMPLETE, too.
>
> So most FC HBA firmware implement the abort task with just a blacklist;
> the TMF will be returned immediately and the command response will be
> dropped if and when is arrives.
Great, thanks for confirming.
Stefan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] libiscsi task cancellation
2018-02-08 15:08 ` Stefan Hajnoczi
@ 2018-02-08 15:13 ` Stefan Hajnoczi
2018-02-14 16:12 ` Felipe Franciosi
1 sibling, 0 replies; 8+ messages in thread
From: Stefan Hajnoczi @ 2018-02-08 15:13 UTC (permalink / raw)
To: Paolo Bonzini
Cc: ronnie sahlberg, Felipe Franciosi, Peter Lieven, qemu-devel,
Hannes Reinecke
On Thu, Feb 8, 2018 at 3:08 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Thu, Feb 8, 2018 at 2:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
> I think users *can* work around the iscsi_task_mgmt_async() issue as follows:
>
> If the command callback has not been invoked yet when the ABORT TASK
> TMF callback is invoked, call iscsi_scsi_cancel_task() so that
> libiscsi invokes the command callback with SCSI_STATUS_CANCELLED and
> frees the PDU.
>
> Here is how that looks in QEMU:
>
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 8140baac15..7b4b43dc55 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -290,8 +290,12 @@ iscsi_abort_task_cb(struct iscsi_context *iscsi,
> int status, void *command_data,
> {
> IscsiAIOCB *acb = private_data;
>
> - acb->status = -ECANCELED;
> - iscsi_schedule_bh(acb);
> + /* If the command callback hasn't been called yet, drop the task */
> + if (!acb->bh) {
> + /* Call iscsi_aio_ioctl_cb() with SCSI_STATUS_CANCELLED */
> + iscsi_scsi_cancel_task(iscsi, acb->task);
> + }
> +
> qemu_aio_unref(acb); /* acquired in iscsi_aio_cancel() */
> }
By the way, this is on top of
https://lists.gnu.org/archive/html/qemu-devel/2018-02/msg00570.html.
Stefan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] libiscsi task cancellation
2018-02-08 14:58 ` Hannes Reinecke
2018-02-08 15:12 ` Stefan Hajnoczi
@ 2018-02-08 15:33 ` Felipe Franciosi
1 sibling, 0 replies; 8+ messages in thread
From: Felipe Franciosi @ 2018-02-08 15:33 UTC (permalink / raw)
To: Hannes Reinecke
Cc: Paolo Bonzini, Stefan Hajnoczi, ronnie sahlberg, Peter Lieven,
qemu-devel
> On 8 Feb 2018, at 14:58, Hannes Reinecke <hare@suse.de> wrote:
>
> On 02/08/2018 03:14 PM, Paolo Bonzini wrote:
>> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
>>> Now on to libiscsi:
>>>
>>> The iscsi_task_mgmt_async() API documentation says:
>>>
>>> * abort_task will also cancel the scsi task. The callback for the
>>> scsi task will be invoked with
>>> * SCSI_STATUS_CANCELLED
>>>
>>> I see that the ABORT TASK TMF response invokes the user's
>>> iscsi_task_mgmt_async() callback but not the command callback. I'm
>>> not sure how the command callback is invoked with
>>> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
>>> that response.
>>>
>>> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
>>
>> No, and QEMU is assuming the "wrong" behavior:
>>
>> static void
>> iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
>> void *private_data)
>> {
>> IscsiAIOCB *acb = private_data;
>>
>> acb->status = -ECANCELED;
>> iscsi_schedule_bh(acb);
>> }
>>
> The definition of ABORT TASK TMF in SAM is pretty much useless.
> To quote:
>
> A response of FUNCTION COMPLETE shall indicate that the task was aborted
> or was not in the task set.
>
> IE we have no idea if we ever managed to abort the task; if the task had
> been in-flight by the time we've send the TMF we'll be getting a
> FUNCTION COMPLETE, too.
Why is that a problem? I am under the impression that drivers can cope with that. You can complete the TMF after the original request completed (successfully or not).
> So most FC HBA firmware implement the abort task with just a blacklist;
> the TMF will be returned immediately and the command response will be
> dropped if and when is arrives.
That sounds very dangerous, but maybe it's safe for FC (I don't know details about it). What happens if you complete the TMF immediately and the OS believes the IO (eg. a write) has been aborted, then issue another write for the same LBA and that completes successfully before the original request (which is still outstanding)?
In our implementation, we sit on the TMF and wait for the original request to either complete or abort. Only then we respond the TMF to the issuing OS.
Cheers,
Felipe
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Qemu-devel] libiscsi task cancellation
2018-02-08 15:08 ` Stefan Hajnoczi
2018-02-08 15:13 ` Stefan Hajnoczi
@ 2018-02-14 16:12 ` Felipe Franciosi
1 sibling, 0 replies; 8+ messages in thread
From: Felipe Franciosi @ 2018-02-14 16:12 UTC (permalink / raw)
To: Stefan Hajnoczi, Hannes Reinecke, Sreejith Mohanan
Cc: Paolo Bonzini, ronnie sahlberg, Peter Lieven, qemu-devel
> On 8 Feb 2018, at 15:08, Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> On Thu, Feb 8, 2018 at 2:14 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 08/02/2018 15:08, Stefan Hajnoczi wrote:
>>> Now on to libiscsi:
>>>
>>> The iscsi_task_mgmt_async() API documentation says:
>>>
>>> * abort_task will also cancel the scsi task. The callback for the
>>> scsi task will be invoked with
>>> * SCSI_STATUS_CANCELLED
>>>
>>> I see that the ABORT TASK TMF response invokes the user's
>>> iscsi_task_mgmt_async() callback but not the command callback. I'm
>>> not sure how the command callback is invoked with
>>> SCSI_STATUS_CANCELLED unless libiscsi is relying on the target to send
>>> that response.
>>>
>>> Is libiscsi honoring its iscsi_task_mgmt_async() contract?
>>
>> No, and QEMU is assuming the "wrong" behavior:
>>
>> static void
>> iscsi_abort_task_cb(struct iscsi_context *iscsi, int status, void *command_data,
>> void *private_data)
>> {
>> IscsiAIOCB *acb = private_data;
>>
>> acb->status = -ECANCELED;
>> iscsi_schedule_bh(acb);
>> }
>
> FWIW My recent use-after-free fix series is still pending review:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.gnu.org_archive_html_qemu-2Ddevel_2018-2D02_msg00570.html&d=DwIBaQ&c=s883GpUCOChKOHiocYtGcg&r=CCrJKVC5zGot8PrnI-iYe00MdX4pgdQfMRigp14Ptmk&m=E9W0Q_a83TeT4c8PEn--G-Dk9dZ5UGO2CZRZzal0Sak&s=BC2keaBG9mG3yYI_IYPwAoA3_I69KmPZordXnDguF0I&e=
>
> It changes the code and fixes a QEMU bug, but it doesn't address the
> iscsi_task_mgmt_async() issue we're discussing now.
>
> I think users *can* work around the iscsi_task_mgmt_async() issue as follows:
>
> If the command callback has not been invoked yet when the ABORT TASK
> TMF callback is invoked, call iscsi_scsi_cancel_task() so that
> libiscsi invokes the command callback with SCSI_STATUS_CANCELLED and
> frees the PDU.
>
> Here is how that looks in QEMU:
>
> diff --git a/block/iscsi.c b/block/iscsi.c
> index 8140baac15..7b4b43dc55 100644
> --- a/block/iscsi.c
> +++ b/block/iscsi.c
> @@ -290,8 +290,12 @@ iscsi_abort_task_cb(struct iscsi_context *iscsi,
> int status, void *command_data,
> {
> IscsiAIOCB *acb = private_data;
>
> - acb->status = -ECANCELED;
> - iscsi_schedule_bh(acb);
> + /* If the command callback hasn't been called yet, drop the task */
> + if (!acb->bh) {
> + /* Call iscsi_aio_ioctl_cb() with SCSI_STATUS_CANCELLED */
> + iscsi_scsi_cancel_task(iscsi, acb->task);
> + }
> +
> qemu_aio_unref(acb); /* acquired in iscsi_aio_cancel() */
> }
Reviewed-by: Felipe Franciosi <felipe@nutanix.com>
Tested-by: Sreejith Mohanan <sreejit.mohanan@nutanix.com>
We've been over this and Sreejith confirmed this fixes the issue we are seeing. Nevertheless, I think we should continue this conversation to clarify what are the expectations with regards to requests that may still be inflight.
F.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-02-14 16:12 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-02-08 14:08 [Qemu-devel] libiscsi task cancellation Stefan Hajnoczi
2018-02-08 14:14 ` Paolo Bonzini
2018-02-08 14:58 ` Hannes Reinecke
2018-02-08 15:12 ` Stefan Hajnoczi
2018-02-08 15:33 ` Felipe Franciosi
2018-02-08 15:08 ` Stefan Hajnoczi
2018-02-08 15:13 ` Stefan Hajnoczi
2018-02-14 16:12 ` Felipe Franciosi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).