From: Niklas Cassel <Niklas.Cassel@wdc.com>
To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>
Cc: "Kevin Wolf" <kwolf@redhat.com>, "Fam Zheng" <fam@euphon.net>,
"Damien Le Moal" <Damien.LeMoal@wdc.com>,
"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
"Klaus Jensen" <k.jensen@samsung.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"Maxim Levitsky" <mlevitsk@redhat.com>,
"Alistair Francis" <Alistair.Francis@wdc.com>,
"Keith Busch" <kbusch@kernel.org>,
"Max Reitz" <mreitz@redhat.com>,
"Philippe Mathieu-Daudé" <philmd@redhat.com>,
"Matias Bjorling" <Matias.Bjorling@wdc.com>
Subject: Re: [PATCH v9 08/12] hw/block/nvme: Support Zoned Namespace Command Set
Date: Fri, 6 Nov 2020 11:59:03 +0000
Message-ID: <20201106115903.GA345539@localhost.localdomain>
In-Reply-To: <20201105025342.9037-9-dmitry.fomichev@wdc.com>
On Thu, Nov 05, 2020 at 11:53:38AM +0900, Dmitry Fomichev wrote:
> The emulation code has been changed to advertise NVM Command Set when
> "zoned" device property is not set (default) and Zoned Namespace
> Command Set otherwise.
>
> Define values and structures that are needed to support Zoned
> Namespace Command Set (NVMe TP 4053) in PCI NVMe controller emulator.
> Define trace events where needed in newly introduced code.
>
> In order to improve scalability, all open, closed and full zones
> are organized in separate linked lists. Consequently, almost all
> zone operations don't require scanning of the entire zone array
> (which potentially can be quite large) - it is only necessary to
> enumerate one or more zone lists.
>
> Handlers for three new NVMe commands introduced in Zoned Namespace
> Command Set specification are added, namely for Zone Management
> Receive, Zone Management Send and Zone Append.
>
> Device initialization code has been extended to create a proper
> configuration for zoned operation using device properties.
>
> Read/Write command handler is modified to only allow writes at the
> write pointer if the namespace is zoned. For Zone Append command,
> writes implicitly happen at the write pointer and the starting write
> pointer value is returned as the result of the command. Write Zeroes
> handler is modified to add zoned checks that are identical to those
> done as a part of Write flow.
>
> Subsequent commits in this series add ZDE support and checks for
> active and open zone limits.
>
> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com>
> Signed-off-by: Ajay Joshi <ajay.joshi@wdc.com>
> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> Signed-off-by: Matias Bjorling <matias.bjorling@wdc.com>
> Signed-off-by: Aravind Ramesh <aravind.ramesh@wdc.com>
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> Signed-off-by: Adam Manzanares <adam.manzanares@wdc.com>
> Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> Reviewed-by: Niklas Cassel <Niklas.Cassel@wdc.com>
> ---
> hw/block/nvme-ns.h | 54 +++
> hw/block/nvme.h | 8 +
> hw/block/nvme-ns.c | 173 ++++++++
> hw/block/nvme.c | 971 +++++++++++++++++++++++++++++++++++++++++-
> hw/block/trace-events | 18 +-
> 5 files changed, 1209 insertions(+), 15 deletions(-)
>
(snip)
> +static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req)
> +{
> + NvmeCmd *cmd = (NvmeCmd *)&req->cmd;
> + NvmeNamespace *ns = req->ns;
> + /* cdw12 is zero-based number of dwords to return. Convert to bytes */
> + uint32_t data_size = (le32_to_cpu(cmd->cdw12) + 1) << 2;
> + uint32_t dw13 = le32_to_cpu(cmd->cdw13);
> + uint32_t zone_idx, zra, zrasf, partial;
> + uint64_t max_zones, nr_zones = 0;
> + uint16_t ret;
> + uint64_t slba;
> + NvmeZoneDescr *z;
> + NvmeZone *zs;
> + NvmeZoneReportHeader *header;
> + void *buf, *buf_p;
> + size_t zone_entry_sz;
> +
> + req->status = NVME_SUCCESS;
> +
> + ret = nvme_get_mgmt_zone_slba_idx(ns, cmd, &slba, &zone_idx);
> + if (ret) {
> + return ret;
> + }
> +
> + zra = dw13 & 0xff;
> + if (zra != NVME_ZONE_REPORT) {
> + return NVME_INVALID_FIELD | NVME_DNR;
> + }
> +
> + zrasf = (dw13 >> 8) & 0xff;
> + if (zrasf > NVME_ZONE_REPORT_OFFLINE) {
> + return NVME_INVALID_FIELD | NVME_DNR;
> + }
> +
> + if (data_size < sizeof(NvmeZoneReportHeader)) {
> + return NVME_INVALID_FIELD | NVME_DNR;
> + }
> +
> + ret = nvme_map_dptr(n, data_size, req);
The nvme_map_dptr() call was not here in the v8 patch set.
On v7 I commented that you were missing a call to nvme_check_mdts().
I think you still need to call nvme_check_mdts() in order to verify
that data_size does not exceed MDTS, no?
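To illustrate the kind of check meant here, a minimal standalone sketch
follows. It is not the QEMU code: toy_check_mdts(), toy_numd_to_bytes()
and the TOY_* status values are hypothetical names, and it assumes MDTS is
a power-of-two multiplier of the page size, with 0 meaning "no limit".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the emulator's status codes. */
enum { TOY_SUCCESS = 0, TOY_INVALID_FIELD = 1 };

/* cdw12 holds a zero-based dword count; convert it to bytes. */
static size_t toy_numd_to_bytes(uint32_t numd)
{
    return ((size_t)numd + 1) << 2;
}

/*
 * Toy MDTS check (assumption: the limit is page_size << mdts bytes,
 * and mdts == 0 means the transfer size is unlimited).
 */
static int toy_check_mdts(uint8_t mdts, uint32_t page_size, size_t len)
{
    if (mdts && len > ((size_t)page_size << mdts)) {
        return TOY_INVALID_FIELD;
    }
    return TOY_SUCCESS;
}
```

With a 4096-byte page and mdts = 7 the limit in this sketch is 512 KiB, so
a report buffer one dword larger than that would be rejected before any
mapping is attempted.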
This function already has a call to nvme_dma(), and nvme_dma() already
calls nvme_map_dptr().
If you use nvme_dma(), you cannot also call nvme_map_dptr() yourself.
Doing so calls nvme_map_addr() (which calls qemu_sglist_add()) on the
same buffer twice, making qsg->size twice what the user sent in.
That in turn trips the:
    if (unlikely(residual)) {
check in nvme_dma(), so the command fails.
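The effect can be reproduced with a toy model. This is only a sketch under
simplifying assumptions: struct toy_sglist, toy_map() and toy_dma() are
hypothetical stand-ins, the list tracks nothing but its total size, and the
residual is modelled as "mapped bytes minus bytes actually copied".

```c
#include <assert.h>
#include <stddef.h>

/* Toy scatter/gather list: only the total mapped size is tracked. */
struct toy_sglist {
    size_t size;
};

/* Stand-in for the mapping step: every call appends len bytes. */
static void toy_map(struct toy_sglist *sg, size_t len)
{
    sg->size += len;
}

/*
 * Stand-in for the DMA helper: it maps the buffer itself, copies at
 * most len bytes, and returns the residual (mapped but not copied).
 */
static size_t toy_dma(struct toy_sglist *sg, size_t len)
{
    size_t copied;

    toy_map(sg, len);
    copied = len < sg->size ? len : sg->size;
    return sg->size - copied;
}
```

Calling toy_dma() on a fresh list leaves no residual, while calling
toy_map() first and then toy_dma() with the same length leaves a residual
equal to that length, which is the condition the residual check rejects.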
Looking at nvme_read()/nvme_write(), they call nvme_map_dptr()
(without any call to nvme_dma()) and then use dma_blk_read() or
dma_blk_write(). Both of them also call nvme_check_mdts().
Kind regards,
Niklas
> + if (ret) {
> + return ret;
> + }
> +
> + partial = (dw13 >> 16) & 0x01;
> +
> + zone_entry_sz = sizeof(NvmeZoneDescr);
> +
> + max_zones = (data_size - sizeof(NvmeZoneReportHeader)) / zone_entry_sz;
> + buf = g_malloc0(data_size);
> +
> + header = (NvmeZoneReportHeader *)buf;
> + buf_p = buf + sizeof(NvmeZoneReportHeader);
> +
> + while (zone_idx < ns->num_zones && nr_zones < max_zones) {
> + zs = &ns->zone_array[zone_idx];
> +
> + if (!nvme_zone_matches_filter(zrasf, zs)) {
> + zone_idx++;
> + continue;
> + }
> +
> + z = (NvmeZoneDescr *)buf_p;
> + buf_p += sizeof(NvmeZoneDescr);
> + nr_zones++;
> +
> + z->zt = zs->d.zt;
> + z->zs = zs->d.zs;
> + z->zcap = cpu_to_le64(zs->d.zcap);
> + z->zslba = cpu_to_le64(zs->d.zslba);
> + z->za = zs->d.za;
> +
> + if (nvme_wp_is_valid(zs)) {
> + z->wp = cpu_to_le64(zs->d.wp);
> + } else {
> + z->wp = cpu_to_le64(~0ULL);
> + }
> +
> + zone_idx++;
> + }
> +
> + if (!partial) {
> + for (; zone_idx < ns->num_zones; zone_idx++) {
> + zs = &ns->zone_array[zone_idx];
> + if (nvme_zone_matches_filter(zrasf, zs)) {
> + nr_zones++;
> + }
> + }
> + }
> + header->nr_zones = cpu_to_le64(nr_zones);
> +
> + ret = nvme_dma(n, (uint8_t *)buf, data_size,
> + DMA_DIRECTION_FROM_DEVICE, req);
> +
> + g_free(buf);
> +
> + return ret;
> +}
> +