[PATCH] hw/nvme: actually implement abort

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH] hw/nvme: actually implement abort
       [not found] <CGME20240702133144epcas5p22b982613bfbfce0e7ad0c74fd72a7956@epcas5p2.samsung.com>
@ 2024-07-02  8:02 ` Ayush Mishra
  2024-07-02 15:24   ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Ayush Mishra @ 2024-07-02  8:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: kbusch, its, foss, Ayush Mishra

Abort was not implemented previously, but we can implement it for AERs and asynchrnously for I/O.

Signed-off-by: Ayush Mishra <ayush.m55@samsung.com>
---
 hw/nvme/ctrl.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 127c3d2383..a38037b5f8 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1759,6 +1759,10 @@ static void nvme_aio_err(NvmeRequest *req, int ret)
         break;
     }
 
+    if (ret == -ECANCELED) {
+        status = NVME_CMD_ABORT_REQ;
+    }
+
     trace_pci_nvme_err_aio(nvme_cid(req), strerror(-ret), status);
 
     error_setg_errno(&local_err, -ret, "aio failed");
@@ -5759,12 +5763,40 @@ static uint16_t nvme_identify(NvmeCtrl *n, NvmeRequest *req)
 static uint16_t nvme_abort(NvmeCtrl *n, NvmeRequest *req)
 {
     uint16_t sqid = le32_to_cpu(req->cmd.cdw10) & 0xffff;
+    uint16_t cid  = (le32_to_cpu(req->cmd.cdw10) >> 16) & 0xffff;
+    NvmeSQueue *sq = n->sq[sqid];
+    NvmeRequest *r, *next;
+    int i;
 
     req->cqe.result = 1;
     if (nvme_check_sqid(n, sqid)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
+    if (sqid == 0) {
+        for (i = 0; i < n->outstanding_aers; i++) {
+            NvmeRequest *re = n->aer_reqs[i];
+            if (re->cqe.cid == cid) {
+                memmove(n->aer_reqs + i, n->aer_reqs + i + 1,
+                         (n->outstanding_aers - i - 1) * sizeof(NvmeRequest *));
+                n->outstanding_aers--;
+                re->status = NVME_CMD_ABORT_REQ;
+                req->cqe.result = 0;
+                nvme_enqueue_req_completion(&n->admin_cq, re);
+                return NVME_SUCCESS;
+            }
+        }
+    }
+
+    QTAILQ_FOREACH_SAFE(r, &sq->out_req_list, entry, next) {
+        if (r->cqe.cid == cid) {
+            if (r->aiocb) {
+                blk_aio_cancel_async(r->aiocb);
+            }
+            break;
+        }
+    }
+
     return NVME_SUCCESS;
 }
 
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] hw/nvme: actually implement abort
  2024-07-02  8:02 ` [PATCH] hw/nvme: actually implement abort Ayush Mishra
@ 2024-07-02 15:24   ` Keith Busch
  2024-07-02 18:55     ` Klaus Jensen
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2024-07-02 15:24 UTC (permalink / raw)
  To: Ayush Mishra; +Cc: qemu-devel, its, foss

On Tue, Jul 02, 2024 at 01:32:32PM +0530, Ayush Mishra wrote:
> Abort was not implemented previously, but we can implement it for AERs and asynchrnously for I/O.

Not implemented for a reason. The target has no idea if the CID the
host requested to be aborted is from the same context that the target
has. Target may have previoulsy completed it, and the host re-issued a
new command after the abort, and due to the queueing could have been
observed in a different order, and now you aborted the wrong command.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] hw/nvme: actually implement abort
  2024-07-02 15:24   ` Keith Busch
@ 2024-07-02 18:55     ` Klaus Jensen
  2024-07-10  9:09       ` Klaus Jensen
  0 siblings, 1 reply; 5+ messages in thread
From: Klaus Jensen @ 2024-07-02 18:55 UTC (permalink / raw)
  To: Keith Busch; +Cc: Ayush Mishra, qemu-devel, foss

[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]

On Jul  2 09:24, Keith Busch wrote:
> On Tue, Jul 02, 2024 at 01:32:32PM +0530, Ayush Mishra wrote:
> > Abort was not implemented previously, but we can implement it for AERs and asynchrnously for I/O.
> 
> Not implemented for a reason. The target has no idea if the CID the
> host requested to be aborted is from the same context that the target
> has. Target may have previoulsy completed it, and the host re-issued a
> new command after the abort, and due to the queueing could have been
> observed in a different order, and now you aborted the wrong command.

I might be missing something here, but are you saying that the Abort
command is fundamentally flawed? Isn't this a host issue? The Abort is
for a specific CID on a specific SQID. The host *should* not screw this
up and reuse a CID it has an outstanding Abort on?

I don't think there are a lot of I/O commands that a host would be able
to cancel (in QEMU, not at all, because only the iscsi backend
actually implements blk_aio_cancel_async). But some commands that issue
multiple AIOs, like Copy, may be long running and with this it can
actually be cancelled.

And with regards to AERs, I don't see why it is not advantageous to be
able to Abort one?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] hw/nvme: actually implement abort
  2024-07-02 18:55     ` Klaus Jensen
@ 2024-07-10  9:09       ` Klaus Jensen
  2024-07-11 15:01         ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Klaus Jensen @ 2024-07-10  9:09 UTC (permalink / raw)
  To: Keith Busch; +Cc: Ayush Mishra, qemu-devel, foss

[-- Attachment #1: Type: text/plain, Size: 1355 bytes --]

On Jul  2 20:55, Klaus Jensen wrote:
> On Jul  2 09:24, Keith Busch wrote:
> > On Tue, Jul 02, 2024 at 01:32:32PM +0530, Ayush Mishra wrote:
> > > Abort was not implemented previously, but we can implement it for AERs and asynchrnously for I/O.
> > 
> > Not implemented for a reason. The target has no idea if the CID the
> > host requested to be aborted is from the same context that the target
> > has. Target may have previoulsy completed it, and the host re-issued a
> > new command after the abort, and due to the queueing could have been
> > observed in a different order, and now you aborted the wrong command.
> 
> I might be missing something here, but are you saying that the Abort
> command is fundamentally flawed? Isn't this a host issue? The Abort is
> for a specific CID on a specific SQID. The host *should* not screw this
> up and reuse a CID it has an outstanding Abort on?
> 
> I don't think there are a lot of I/O commands that a host would be able
> to cancel (in QEMU, not at all, because only the iscsi backend
> actually implements blk_aio_cancel_async). But some commands that issue
> multiple AIOs, like Copy, may be long running and with this it can
> actually be cancelled.
> 
> And with regards to AERs, I don't see why it is not advantageous to be
> able to Abort one?

Keith, any thoughts on this?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] hw/nvme: actually implement abort
  2024-07-10  9:09       ` Klaus Jensen
@ 2024-07-11 15:01         ` Keith Busch
  0 siblings, 0 replies; 5+ messages in thread
From: Keith Busch @ 2024-07-11 15:01 UTC (permalink / raw)
  To: Klaus Jensen; +Cc: Ayush Mishra, qemu-devel, foss

On Wed, Jul 10, 2024 at 11:09:43AM +0200, Klaus Jensen wrote:
> On Jul  2 20:55, Klaus Jensen wrote:
> > On Jul  2 09:24, Keith Busch wrote:
> > > On Tue, Jul 02, 2024 at 01:32:32PM +0530, Ayush Mishra wrote:
> > > > Abort was not implemented previously, but we can implement it for AERs and asynchrnously for I/O.
> > > 
> > > Not implemented for a reason. The target has no idea if the CID the
> > > host requested to be aborted is from the same context that the target
> > > has. Target may have previoulsy completed it, and the host re-issued a
> > > new command after the abort, and due to the queueing could have been
> > > observed in a different order, and now you aborted the wrong command.
> > 
> > I might be missing something here, but are you saying that the Abort
> > command is fundamentally flawed? Isn't this a host issue? The Abort is
> > for a specific CID on a specific SQID. The host *should* not screw this
> > up and reuse a CID it has an outstanding Abort on?
> > 
> > I don't think there are a lot of I/O commands that a host would be able
> > to cancel (in QEMU, not at all, because only the iscsi backend
> > actually implements blk_aio_cancel_async). But some commands that issue
> > multiple AIOs, like Copy, may be long running and with this it can
> > actually be cancelled.
> > 
> > And with regards to AERs, I don't see why it is not advantageous to be
> > able to Abort one?
> 
> Keith, any thoughts on this?

Oh, you can take this if you want, I'm just mentioning the pitfalls with
the abort command. While sequestoring command id's that are being
aborted may be good practice for the host, the spec doesn't say anything
about it. The Linux driver doesn't do that at least, though it recently
created a different mechanism to avoid immediate command id reuse.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-07-11 15:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CGME20240702133144epcas5p22b982613bfbfce0e7ad0c74fd72a7956@epcas5p2.samsung.com>
2024-07-02  8:02 ` [PATCH] hw/nvme: actually implement abort Ayush Mishra
2024-07-02 15:24   ` Keith Busch
2024-07-02 18:55     ` Klaus Jensen
2024-07-10  9:09       ` Klaus Jensen
2024-07-11 15:01         ` Keith Busch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).