* [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN
@ 2022-10-21 23:10 clay.mayers
2022-10-21 23:10 ` [PATCH V2 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
` (5 more replies)
0 siblings, 6 replies; 8+ messages in thread
From: clay.mayers @ 2022-10-21 23:10 UTC (permalink / raw)
To: qemu-devel
Cc: Keith Busch, Klaus Jensen, Fam Zheng, Phlippe Mathieu-Daudé
From: Clay Mayers <clay.mayers@kioxia.com>
ZNS controllers have the option to limit the time a zone can remain in
the active state. It begins with a background process in the controller
setting the finish-zone-recommended FZR attribute for a zone. As part of
setting this attribute, the zone's id is added to the namespace's
zone-descriptor-changed (ZDC) log page. If enabled, items added to the
ZDC log page generate a ZDC "asynchronous event notification" AEN. Optionally,
the control can induce a "zone excursion" forcing the zone into the finished
state that also generates a ZDC event.
Zone enabled applications need to properly handle ZDC events. In a real device,
the timeout is many hours making testing an application difficult.
Implemented is the generation of FZR ZDC events to speed up O/S and application
testing.
Added to the zoned NVMe command set is an optional, per-namespace timer
(zoned.finish_time) to set the FZR attr for long-lived active zones; A per
namespace ZDC log page; AEN results to including CQE.DW1 (the NSID of the ZDC
AEN) and generating a ZDC AEN if it's been enabled. Zone excursions are not
modeled.
See section 5.5 of the NVMe Zoned Namespace Command Set Specification v1.1
for more details.
Changes since v1
- Fixed offset length checking in zdc log page
- Moved zdc_event_queued to the patch 4
- Unwatched zdc events in nvme_exit()
Clay Mayers (4):
hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
hw/block/nvme: add zone descriptor changed log page
hw/block/nvme: supply dw1 for aen result
hw/block/nvme: add zone descriptor changed AEN
docs/system/devices/nvme.rst | 5 +
hw/nvme/ctrl.c | 174 +++++++++++++++++++++++++++++++++--
hw/nvme/ns.c | 15 +++
hw/nvme/nvme.h | 37 +++++++-
hw/nvme/trace-events | 3 +-
include/block/nvme.h | 14 ++-
6 files changed, 233 insertions(+), 15 deletions(-)
--
2.27.0
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH V2 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
2022-10-21 23:10 [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
@ 2022-10-21 23:10 ` clay.mayers
2022-10-21 23:10 ` [PATCH V2 2/4] hw/block/nvme: add zone descriptor changed log page clay.mayers
` (4 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: clay.mayers @ 2022-10-21 23:10 UTC (permalink / raw)
To: qemu-devel
Cc: Keith Busch, Klaus Jensen, Fam Zheng, Phlippe Mathieu-Daudé
From: Clay Mayers <clay.mayers@kioxia.com>
Adds ns.param.zoned.finish_time, which sets the number of
seconds a zone can remain active before the zone attribute
ZONE_FINISH_RECOMMENDED is set.
This requires scanning the exp open, imp open and closed lists
of zones whenever a zone is marked as requiring finishing. The
expectation is these lists will be short (10s of items) allowing a
simpler implementation than keeping the lists sorted. It also
keeps the overhead during the exception of a timeout instead of
when zones change state between open and closed. For use cases
where this isn't true, finish_time should be 0 to disable this
feature (the default).
Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
---
docs/system/devices/nvme.rst | 5 +++++
hw/nvme/ctrl.c | 35 +++++++++++++++++++++++++++++++++++
hw/nvme/ns.c | 8 ++++++++
hw/nvme/nvme.h | 18 ++++++++++++++----
4 files changed, 62 insertions(+), 4 deletions(-)
diff --git a/docs/system/devices/nvme.rst b/docs/system/devices/nvme.rst
index 30f841ef62..1cb0ef844c 100644
--- a/docs/system/devices/nvme.rst
+++ b/docs/system/devices/nvme.rst
@@ -212,6 +212,11 @@ The namespace may be configured with additional parameters
the minimum memory page size (CAP.MPSMIN). The default value (``0``)
has this property inherit the ``mdts`` value.
+``zoned.finish_time=UINT32`` (default: ``0``)
+ Set the time in seconds for how long a zone can be active before setting the
+ zone attribute ``Zone Finish Recommended``. The default value (``0``)
+ disables this feature.
+
Metadata
--------
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 87aeba0564..d7e9fae0b0 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1516,6 +1516,38 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
}
}
+static void nvme_check_finish(NvmeNamespace *ns, NvmeZoneListHead *list)
+{
+ int64_t now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
+ NvmeZone *zone;
+
+ QTAILQ_FOREACH(zone, list, entry) {
+ if (zone->finish_ms <= now) {
+ zone->finish_ms = INT64_MAX;
+ zone->d.za |= NVME_ZA_FINISH_RECOMMENDED;
+ } else if (zone->finish_ms != INT64_MAX) {
+ timer_mod_anticipate(ns->active_timer, zone->finish_ms);
+ }
+ }
+}
+
+void nvme_finish_needed(void *opaque)
+{
+ NvmeNamespace *ns = opaque;
+
+ nvme_check_finish(ns, &ns->exp_open_zones);
+ nvme_check_finish(ns, &ns->imp_open_zones);
+ nvme_check_finish(ns, &ns->closed_zones);
+}
+
+void nvme_set_active_timeout(NvmeNamespace *ns, NvmeZone *zone)
+{
+ if (ns->fto_ms) {
+ zone->finish_ms = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + ns->fto_ms;
+ timer_mod_anticipate(ns->active_timer, zone->finish_ms);
+ }
+}
+
static inline uint16_t nvme_check_mdts(NvmeCtrl *n, size_t len)
{
uint8_t mdts = n->params.mdts;
@@ -1791,6 +1823,7 @@ static uint16_t nvme_zrm_finish(NvmeNamespace *ns, NvmeZone *zone)
/* fallthrough */
case NVME_ZONE_STATE_EMPTY:
+ zone->d.za &= ~NVME_ZA_FINISH_RECOMMENDED;
nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL);
return NVME_SUCCESS;
@@ -1891,6 +1924,7 @@ static uint16_t nvme_zrm_open_flags(NvmeCtrl *n, NvmeNamespace *ns,
if (act) {
nvme_aor_inc_active(ns);
+ nvme_set_active_timeout(ns, zone);
}
nvme_aor_inc_open(ns);
@@ -3619,6 +3653,7 @@ static uint16_t nvme_set_zd_ext(NvmeNamespace *ns, NvmeZone *zone)
return status;
}
nvme_aor_inc_active(ns);
+ nvme_set_active_timeout(ns, zone);
zone->d.za |= NVME_ZA_ZD_EXT_VALID;
nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED);
return NVME_SUCCESS;
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index 62a1f97be0..b577f2d8e0 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -322,6 +322,11 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
ns->id_ns.nsfeat &= ~0x4;
}
+ ns->fto_ms = ns->params.fto * INT64_C(1000);
+ if (ns->fto_ms) {
+ ns->active_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, nvme_finish_needed,
+ ns);
+ }
ns->id_ns_zoned = id_ns_z;
}
@@ -338,6 +343,7 @@ static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone *zone)
nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED);
}
nvme_aor_inc_active(ns);
+ nvme_set_active_timeout(ns, zone);
QTAILQ_INSERT_HEAD(&ns->closed_zones, zone, entry);
} else {
trace_pci_nvme_clear_ns_reset(state, zone->d.zslba);
@@ -521,6 +527,7 @@ void nvme_ns_shutdown(NvmeNamespace *ns)
void nvme_ns_cleanup(NvmeNamespace *ns)
{
if (ns->params.zoned) {
+ timer_free(ns->active_timer);
g_free(ns->id_ns_zoned);
g_free(ns->zone_array);
g_free(ns->zd_extensions);
@@ -644,6 +651,7 @@ static Property nvme_ns_props[] = {
DEFINE_PROP_SIZE("zoned.zrwafg", NvmeNamespace, params.zrwafg, -1),
DEFINE_PROP_BOOL("eui64-default", NvmeNamespace, params.eui64_default,
false),
+ DEFINE_PROP_UINT32("zoned.finish_time", NvmeNamespace, params.fto, 0),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 79f5c281c2..9a54dcdb32 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -93,6 +93,7 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys,
typedef struct NvmeZone {
NvmeZoneDescr d;
uint64_t w_ptr;
+ int64_t finish_ms;
QTAILQ_ENTRY(NvmeZone) entry;
} NvmeZone;
@@ -121,12 +122,15 @@ typedef struct NvmeNamespaceParams {
uint32_t max_active_zones;
uint32_t max_open_zones;
uint32_t zd_extension_size;
+ uint32_t fto;
uint32_t numzrwa;
uint64_t zrwas;
uint64_t zrwafg;
} NvmeNamespaceParams;
+typedef QTAILQ_HEAD(, NvmeZone) NvmeZoneListHead;
+
typedef struct NvmeNamespace {
DeviceState parent_obj;
BlockConf blkconf;
@@ -154,10 +158,10 @@ typedef struct NvmeNamespace {
NvmeIdNsZoned *id_ns_zoned;
NvmeZone *zone_array;
- QTAILQ_HEAD(, NvmeZone) exp_open_zones;
- QTAILQ_HEAD(, NvmeZone) imp_open_zones;
- QTAILQ_HEAD(, NvmeZone) closed_zones;
- QTAILQ_HEAD(, NvmeZone) full_zones;
+ NvmeZoneListHead exp_open_zones;
+ NvmeZoneListHead imp_open_zones;
+ NvmeZoneListHead closed_zones;
+ NvmeZoneListHead full_zones;
uint32_t num_zones;
uint64_t zone_size;
uint64_t zone_capacity;
@@ -166,6 +170,9 @@ typedef struct NvmeNamespace {
int32_t nr_open_zones;
int32_t nr_active_zones;
+ int64_t fto_ms;
+ QEMUTimer *active_timer;
+
NvmeNamespaceParams params;
struct {
@@ -274,6 +281,9 @@ static inline void nvme_aor_dec_active(NvmeNamespace *ns)
assert(ns->nr_active_zones >= 0);
}
+void nvme_set_active_timeout(NvmeNamespace *ns, NvmeZone *zone);
+void nvme_finish_needed(void *opaque);
+
void nvme_ns_init_format(NvmeNamespace *ns);
int nvme_ns_setup(NvmeNamespace *ns, Error **errp);
void nvme_ns_drain(NvmeNamespace *ns);
--
2.27.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH V2 2/4] hw/block/nvme: add zone descriptor changed log page
2022-10-21 23:10 [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
2022-10-21 23:10 ` [PATCH V2 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
@ 2022-10-21 23:10 ` clay.mayers
2022-10-21 23:10 ` [PATCH V2 3/4] hw/block/nvme: supply dw1 for aen result clay.mayers
` (3 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: clay.mayers @ 2022-10-21 23:10 UTC (permalink / raw)
To: qemu-devel
Cc: Keith Busch, Klaus Jensen, Fam Zheng, Phlippe Mathieu-Daudé
From: Clay Mayers <clay.mayers@kioxia.com>
Zones marked with ZONE_FINISH_RECOMMENDED are added to the zone
descriptor changed log page. Once read with RAE cleared, they are
removed from the list.
Zones stay in the list regardless of what other states the zones may
go through so applications must be aware of ABA issues where finish
may be recommended, the zone freed and re-opened and now the attribute
is now clear.
Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
---
hw/nvme/ctrl.c | 57 ++++++++++++++++++++++++++++++++++++++++++++
hw/nvme/ns.c | 6 +++++
hw/nvme/nvme.h | 8 +++++++
hw/nvme/trace-events | 1 +
include/block/nvme.h | 8 +++++++
5 files changed, 80 insertions(+)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index d7e9fae0b0..5b5ceb90c2 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1516,15 +1516,42 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
}
}
+static void nvme_zdc_list(NvmeNamespace *ns, NvmeZoneIdList *zlist, bool reset)
+{
+ NvmeZdc *zdc;
+ NvmeZdc *next;
+ int index = 0;
+
+ QTAILQ_FOREACH_SAFE(zdc, &ns->zdc_list, entry, next) {
+ if (index >= ARRAY_SIZE(zlist->zids)) {
+ break;
+ }
+ zlist->zids[index++] = zdc->zone->d.zslba;
+ if (reset) {
+ QTAILQ_REMOVE(&ns->zdc_list, zdc, entry);
+ zdc->zone->zdc_entry = NULL;
+ g_free(zdc);
+ }
+ }
+ zlist->nzid = cpu_to_le16(index);
+}
+
static void nvme_check_finish(NvmeNamespace *ns, NvmeZoneListHead *list)
{
int64_t now = qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL);
NvmeZone *zone;
+ NvmeZdc *zdc;
QTAILQ_FOREACH(zone, list, entry) {
if (zone->finish_ms <= now) {
zone->finish_ms = INT64_MAX;
zone->d.za |= NVME_ZA_FINISH_RECOMMENDED;
+ if (!zone->zdc_entry) {
+ zdc = g_malloc0(sizeof(*zdc));
+ zdc->zone = zone;
+ zone->zdc_entry = zdc;
+ QTAILQ_INSERT_TAIL(&ns->zdc_list, zdc, entry);
+ }
} else if (zone->finish_ms != INT64_MAX) {
timer_mod_anticipate(ns->active_timer, zone->finish_ms);
}
@@ -4675,6 +4702,34 @@ static uint16_t nvme_cmd_effects(NvmeCtrl *n, uint8_t csi, uint32_t buf_len,
return nvme_c2h(n, ((uint8_t *)&log) + off, trans_len, req);
}
+static uint16_t nvme_changed_zones(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
+ uint64_t off, NvmeRequest *req)
+{
+ NvmeNamespace *ns;
+ NvmeCmd *cmd = &req->cmd;
+ uint32_t nsid = le32_to_cpu(cmd->nsid);
+ NvmeZoneIdList zlist = { };
+ uint32_t trans_len;
+
+ if (off >= sizeof(zlist)) {
+ trace_pci_nvme_err_invalid_log_page_offset(off, sizeof(zlist));
+ return NVME_INVALID_FIELD | NVME_DNR;
+ }
+
+ nsid = le32_to_cpu(cmd->nsid);
+ trace_pci_nvme_changed_zones(nsid);
+
+ ns = nvme_ns(n, nsid);
+ if (!ns) {
+ return NVME_INVALID_NSID | NVME_DNR;
+ }
+ nvme_zdc_list(ns, &zlist, !rae);
+
+ trans_len = MIN(sizeof(zlist) - off, buf_len);
+
+ return nvme_c2h(n, ((uint8_t *)&zlist) + off, trans_len, req);
+}
+
static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
{
NvmeCmd *cmd = &req->cmd;
@@ -4722,6 +4777,8 @@ static uint16_t nvme_get_log(NvmeCtrl *n, NvmeRequest *req)
return nvme_changed_nslist(n, rae, len, off, req);
case NVME_LOG_CMD_EFFECTS:
return nvme_cmd_effects(n, csi, len, off, req);
+ case NVME_LOG_CHANGED_ZONE:
+ return nvme_changed_zones(n, rae, len, off, req);
default:
trace_pci_nvme_err_invalid_log_page(nvme_cid(req), lid);
return NVME_INVALID_FIELD | NVME_DNR;
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index b577f2d8e0..25cd490c99 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -240,6 +240,7 @@ static void nvme_ns_zoned_init_state(NvmeNamespace *ns)
QTAILQ_INIT(&ns->imp_open_zones);
QTAILQ_INIT(&ns->closed_zones);
QTAILQ_INIT(&ns->full_zones);
+ QTAILQ_INIT(&ns->zdc_list);
zone = ns->zone_array;
for (i = 0; i < ns->num_zones; i++, zone++) {
@@ -526,8 +527,13 @@ void nvme_ns_shutdown(NvmeNamespace *ns)
void nvme_ns_cleanup(NvmeNamespace *ns)
{
+ NvmeZdc *zdc;
+
if (ns->params.zoned) {
timer_free(ns->active_timer);
+ while ((zdc = QTAILQ_FIRST(&ns->zdc_list))) {
+ g_free(zdc);
+ }
g_free(ns->id_ns_zoned);
g_free(ns->zone_array);
g_free(ns->zd_extensions);
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 9a54dcdb32..ae65226150 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -32,6 +32,7 @@ QEMU_BUILD_BUG_ON(NVME_MAX_NAMESPACES > NVME_NSID_BROADCAST - 1);
typedef struct NvmeCtrl NvmeCtrl;
typedef struct NvmeNamespace NvmeNamespace;
+typedef struct NvmeZone NvmeZone;
#define TYPE_NVME_BUS "nvme-bus"
OBJECT_DECLARE_SIMPLE_TYPE(NvmeBus, NVME_BUS)
@@ -90,10 +91,16 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys,
#define NVME_NS(obj) \
OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
+typedef struct NvmeZdc {
+ QTAILQ_ENTRY(NvmeZdc) entry;
+ NvmeZone *zone;
+} NvmeZdc;
+
typedef struct NvmeZone {
NvmeZoneDescr d;
uint64_t w_ptr;
int64_t finish_ms;
+ NvmeZdc *zdc_entry;
QTAILQ_ENTRY(NvmeZone) entry;
} NvmeZone;
@@ -172,6 +179,7 @@ typedef struct NvmeNamespace {
int64_t fto_ms;
QEMUTimer *active_timer;
+ QTAILQ_HEAD(, NvmeZdc) zdc_list;
NvmeNamespaceParams params;
diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index fccb79f489..337927e607 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -64,6 +64,7 @@ pci_nvme_identify_nslist(uint32_t ns) "nsid %"PRIu32""
pci_nvme_identify_nslist_csi(uint16_t ns, uint8_t csi) "nsid=%"PRIu16", csi=0x%"PRIx8""
pci_nvme_identify_cmd_set(void) "identify i/o command set"
pci_nvme_identify_ns_descr_list(uint32_t ns) "nsid %"PRIu32""
+pci_nvme_changed_zones(uint32_t ns) "nsid %"PRIu32""
pci_nvme_get_log(uint16_t cid, uint8_t lid, uint8_t lsp, uint8_t rae, uint32_t len, uint64_t off) "cid %"PRIu16" lid 0x%"PRIx8" lsp 0x%"PRIx8" rae 0x%"PRIx8" len %"PRIu32" off %"PRIu64""
pci_nvme_getfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t sel, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" sel 0x%"PRIx8" cdw11 0x%"PRIx32""
pci_nvme_setfeat(uint16_t cid, uint32_t nsid, uint8_t fid, uint8_t save, uint32_t cdw11) "cid %"PRIu16" nsid 0x%"PRIx32" fid 0x%"PRIx8" save 0x%"PRIx8" cdw11 0x%"PRIx32""
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 8027b7126b..c747cc4948 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -1010,6 +1010,7 @@ enum NvmeLogIdentifier {
NVME_LOG_FW_SLOT_INFO = 0x03,
NVME_LOG_CHANGED_NSLIST = 0x04,
NVME_LOG_CMD_EFFECTS = 0x05,
+ NVME_LOG_CHANGED_ZONE = 0xbf,
};
typedef struct QEMU_PACKED NvmePSD {
@@ -1617,6 +1618,12 @@ typedef enum NvmeVirtualResourceType {
NVME_VIRT_RES_INTERRUPT = 0x01,
} NvmeVirtualResourceType;
+typedef struct QEMU_PACKED NvmeZoneIdList {
+ uint16_t nzid;
+ uint16_t rsvd2[3];
+ uint64_t zids[511];
+} NvmeZoneIdList;
+
static inline void _nvme_check_size(void)
{
QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
@@ -1655,5 +1662,6 @@ static inline void _nvme_check_size(void)
QEMU_BUILD_BUG_ON(sizeof(NvmePriCtrlCap) != 4096);
QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlEntry) != 32);
QEMU_BUILD_BUG_ON(sizeof(NvmeSecCtrlList) != 4096);
+ QEMU_BUILD_BUG_ON(sizeof(NvmeZoneIdList) != 4096);
}
#endif
--
2.27.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH V2 3/4] hw/block/nvme: supply dw1 for aen result
2022-10-21 23:10 [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
2022-10-21 23:10 ` [PATCH V2 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
2022-10-21 23:10 ` [PATCH V2 2/4] hw/block/nvme: add zone descriptor changed log page clay.mayers
@ 2022-10-21 23:10 ` clay.mayers
2022-10-21 23:10 ` [PATCH V2 4/4] hw/block/nvme: add zone descriptor changed AEN clay.mayers
` (2 subsequent siblings)
5 siblings, 0 replies; 8+ messages in thread
From: clay.mayers @ 2022-10-21 23:10 UTC (permalink / raw)
To: qemu-devel
Cc: Keith Busch, Klaus Jensen, Fam Zheng, Phlippe Mathieu-Daudé
From: Clay Mayers <clay.mayers@kioxia.com>
cqe.dw1 AEN is sometimes required to convey the NSID of the log page
to read. This is the case for the zone descriptor changed log
page.
Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
---
hw/nvme/ctrl.c | 19 +++++++++++--------
hw/nvme/trace-events | 2 +-
include/block/nvme.h | 4 +++-
3 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 5b5ceb90c2..4fb85160cc 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1445,6 +1445,7 @@ static void nvme_process_aers(void *opaque)
result->event_type = event->result.event_type;
result->event_info = event->result.event_info;
result->log_page = event->result.log_page;
+ req->cqe.dw1 = cpu_to_le32(event->result.nsid);
g_free(event);
trace_pci_nvme_aer_post_cqe(result->event_type, result->event_info,
@@ -1455,11 +1456,12 @@ static void nvme_process_aers(void *opaque)
}
static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
- uint8_t event_info, uint8_t log_page)
+ uint8_t event_info, uint8_t log_page,
+ uint32_t nsid)
{
NvmeAsyncEvent *event;
- trace_pci_nvme_enqueue_event(event_type, event_info, log_page);
+ trace_pci_nvme_enqueue_event(event_type, event_info, log_page, nsid);
if (n->aer_queued == n->params.aer_max_queued) {
trace_pci_nvme_enqueue_event_noqueue(n->aer_queued);
@@ -1471,6 +1473,7 @@ static void nvme_enqueue_event(NvmeCtrl *n, uint8_t event_type,
.event_type = event_type,
.event_info = event_info,
.log_page = log_page,
+ .nsid = nsid,
};
QTAILQ_INSERT_TAIL(&n->aer_queue, event, entry);
@@ -1505,7 +1508,7 @@ static void nvme_smart_event(NvmeCtrl *n, uint8_t event)
return;
}
- nvme_enqueue_event(n, NVME_AER_TYPE_SMART, aer_info, NVME_LOG_SMART_INFO);
+ nvme_enqueue_event(n, NVME_AER_TYPE_SMART, aer_info, NVME_LOG_SMART_INFO, 0);
}
static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
@@ -5830,7 +5833,7 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req)
if (!test_and_set_bit(nsid, ctrl->changed_nsids)) {
nvme_enqueue_event(ctrl, NVME_AER_TYPE_NOTICE,
NVME_AER_INFO_NOTICE_NS_ATTR_CHANGED,
- NVME_LOG_CHANGED_NSLIST);
+ NVME_LOG_CHANGED_NSLIST, 0);
}
}
@@ -6971,7 +6974,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
if (n->outstanding_aers) {
nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
- NVME_LOG_ERROR_INFO);
+ NVME_LOG_ERROR_INFO, 0);
}
return;
@@ -6988,7 +6991,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
if (n->outstanding_aers) {
nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
NVME_AER_INFO_ERR_INVALID_DB_VALUE,
- NVME_LOG_ERROR_INFO);
+ NVME_LOG_ERROR_INFO, 0);
}
return;
@@ -7033,7 +7036,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
if (n->outstanding_aers) {
nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
NVME_AER_INFO_ERR_INVALID_DB_REGISTER,
- NVME_LOG_ERROR_INFO);
+ NVME_LOG_ERROR_INFO, 0);
}
return;
@@ -7050,7 +7053,7 @@ static void nvme_process_db(NvmeCtrl *n, hwaddr addr, int val)
if (n->outstanding_aers) {
nvme_enqueue_event(n, NVME_AER_TYPE_ERROR,
NVME_AER_INFO_ERR_INVALID_DB_VALUE,
- NVME_LOG_ERROR_INFO);
+ NVME_LOG_ERROR_INFO, 0);
}
return;
diff --git a/hw/nvme/trace-events b/hw/nvme/trace-events
index 337927e607..86c01f8762 100644
--- a/hw/nvme/trace-events
+++ b/hw/nvme/trace-events
@@ -80,7 +80,7 @@ pci_nvme_aer_masked(uint8_t type, uint8_t mask) "type 0x%"PRIx8" mask 0x%"PRIx8"
pci_nvme_aer_post_cqe(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
pci_nvme_ns_attachment(uint16_t cid, uint8_t sel) "cid %"PRIu16", sel=0x%"PRIx8""
pci_nvme_ns_attachment_attach(uint16_t cntlid, uint32_t nsid) "cntlid=0x%"PRIx16", nsid=0x%"PRIx32""
-pci_nvme_enqueue_event(uint8_t typ, uint8_t info, uint8_t log_page) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8""
+pci_nvme_enqueue_event(uint8_t typ, uint8_t info, uint8_t log_page, uint32_t nsid) "type 0x%"PRIx8" info 0x%"PRIx8" lid 0x%"PRIx8" nsid %"PRIu32""
pci_nvme_enqueue_event_noqueue(int queued) "queued %d"
pci_nvme_enqueue_event_masked(uint8_t typ) "type 0x%"PRIx8""
pci_nvme_no_outstanding_aers(void) "ignoring event; no outstanding AERs"
diff --git a/include/block/nvme.h b/include/block/nvme.h
index c747cc4948..9467d4b939 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -837,6 +837,7 @@ typedef struct QEMU_PACKED NvmeAerResult {
uint8_t event_info;
uint8_t log_page;
uint8_t resv;
+ uint32_t nsid;
} NvmeAerResult;
typedef struct QEMU_PACKED NvmeZonedResult {
@@ -1228,6 +1229,7 @@ enum NvmeNsAttachmentOperation {
#define NVME_AEC_SMART(aec) (aec & 0xff)
#define NVME_AEC_NS_ATTR(aec) ((aec >> 8) & 0x1)
#define NVME_AEC_FW_ACTIVATION(aec) ((aec >> 9) & 0x1)
+#define NVME_AEC_ZONE_CHANGED(aec) ((aec >> 27) & 0x1)
#define NVME_ERR_REC_TLER(err_rec) (err_rec & 0xffff)
#define NVME_ERR_REC_DULBE(err_rec) (err_rec & 0x10000)
@@ -1627,7 +1629,7 @@ typedef struct QEMU_PACKED NvmeZoneIdList {
static inline void _nvme_check_size(void)
{
QEMU_BUILD_BUG_ON(sizeof(NvmeBar) != 4096);
- QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 4);
+ QEMU_BUILD_BUG_ON(sizeof(NvmeAerResult) != 8);
QEMU_BUILD_BUG_ON(sizeof(NvmeZonedResult) != 8);
QEMU_BUILD_BUG_ON(sizeof(NvmeCqe) != 16);
QEMU_BUILD_BUG_ON(sizeof(NvmeDsmRange) != 16);
--
2.27.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH V2 4/4] hw/block/nvme: add zone descriptor changed AEN
2022-10-21 23:10 [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
` (2 preceding siblings ...)
2022-10-21 23:10 ` [PATCH V2 3/4] hw/block/nvme: supply dw1 for aen result clay.mayers
@ 2022-10-21 23:10 ` clay.mayers
2022-10-24 15:22 ` [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN Clay Mayers
2022-11-09 8:40 ` Klaus Jensen
5 siblings, 0 replies; 8+ messages in thread
From: clay.mayers @ 2022-10-21 23:10 UTC (permalink / raw)
To: qemu-devel
Cc: Keith Busch, Klaus Jensen, Fam Zheng, Phlippe Mathieu-Daudé
From: Clay Mayers <clay.mayers@kioxia.com>
If a namespace's param.zoned.finish_time is non-zero,
controllers register with the namespace to be notified
when entries are added to its zone-descriptor-changed
log page. If the zone-descriptor-changed aen is enabled,
this will cause an AEN to be sent from that controller.
Signed-off-by: Clay Mayers <clay.mayers@kioxia.com>
---
hw/nvme/ctrl.c | 63 +++++++++++++++++++++++++++++++++++++++++++-
hw/nvme/ns.c | 1 +
hw/nvme/nvme.h | 11 ++++++++
include/block/nvme.h | 2 ++
4 files changed, 76 insertions(+), 1 deletion(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 4fb85160cc..be6147363d 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -1519,6 +1519,52 @@ static void nvme_clear_events(NvmeCtrl *n, uint8_t event_type)
}
}
+static void nvme_zdc_watch(NvmeCtrl *n, NvmeNamespace *ns, NvmeNotifyFnc fnc)
+{
+ NvmeZdcNotify *watcher = g_malloc0(sizeof(*watcher));
+
+ watcher->n = n;
+ watcher->notify = fnc;
+ QTAILQ_INSERT_TAIL(&ns->zdc_watchers, watcher, entry);
+}
+
+static void nvme_zdc_unwatch(NvmeCtrl *n, NvmeNamespace *ns)
+{
+ NvmeZdcNotify *watcher;
+
+ QTAILQ_FOREACH(watcher, &ns->zdc_watchers, entry) {
+ if (watcher->n == n) {
+ QTAILQ_REMOVE(&ns->zdc_watchers, watcher, entry);
+ break;
+ }
+ }
+}
+
+static void nvme_zdc_notify(NvmeNamespace *ns)
+{
+ NvmeZdcNotify *watcher;
+
+ QTAILQ_FOREACH(watcher, &ns->zdc_watchers, entry) {
+ (*watcher->notify)(watcher->n, ns);
+ }
+}
+
+static void nvme_zdc_aen(NvmeCtrl *n, NvmeNamespace *ns)
+{
+ g_assert(n->id_ctrl.oaes & (1 << 27));
+
+ if (!NVME_AEC_ZONE_CHANGED(n->features.async_config)) {
+ return;
+ }
+
+ if (!n->zdc_event_queued) {
+ n->zdc_event_queued = true;
+ nvme_enqueue_event(n, NVME_AER_TYPE_NOTICE,
+ NVME_AER_INFO_NOTICE_ZONE_DESC_CHANGED,
+ NVME_LOG_CHANGED_ZONE, ns->params.nsid);
+ }
+}
+
static void nvme_zdc_list(NvmeNamespace *ns, NvmeZoneIdList *zlist, bool reset)
{
NvmeZdc *zdc;
@@ -1554,6 +1600,7 @@ static void nvme_check_finish(NvmeNamespace *ns, NvmeZoneListHead *list)
zdc->zone = zone;
zone->zdc_entry = zdc;
QTAILQ_INSERT_TAIL(&ns->zdc_list, zdc, entry);
+ nvme_zdc_notify(ns);
}
} else if (zone->finish_ms != INT64_MAX) {
timer_mod_anticipate(ns->active_timer, zone->finish_ms);
@@ -4727,6 +4774,14 @@ static uint16_t nvme_changed_zones(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
return NVME_INVALID_NSID | NVME_DNR;
}
nvme_zdc_list(ns, &zlist, !rae);
+ if (!rae) {
+ n->zdc_event_queued = false;
+ nvme_clear_events(n, NVME_AER_TYPE_NOTICE);
+ /* send a new aen if there are still zdc entries */
+ if (!QTAILQ_EMPTY(&ns->zdc_list)) {
+ nvme_zdc_notify(ns);
+ }
+ }
trans_len = MIN(sizeof(zlist) - off, buf_len);
@@ -5815,6 +5870,7 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req)
return NVME_NS_NOT_ATTACHED | NVME_DNR;
}
+ nvme_zdc_unwatch(n, ns);
ctrl->namespaces[nsid] = NULL;
ns->attached--;
@@ -7542,7 +7598,7 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
id->cntlid = cpu_to_le16(n->cntlid);
- id->oaes = cpu_to_le32(NVME_OAES_NS_ATTR);
+ id->oaes = cpu_to_le32(NVME_OAES_NS_ATTR | NVME_OAES_ZDC);
id->ctratt |= cpu_to_le32(NVME_CTRATT_ELBAS);
id->rab = 6;
@@ -7659,6 +7715,10 @@ void nvme_attach_ns(NvmeCtrl *n, NvmeNamespace *ns)
n->dmrsl = MIN_NON_ZERO(n->dmrsl,
BDRV_REQUEST_MAX_BYTES / nvme_l2b(ns, 1));
+
+ if (ns->params.fto) {
+ nvme_zdc_watch(n, ns, nvme_zdc_aen);
+ }
}
static void nvme_realize(PCIDevice *pci_dev, Error **errp)
@@ -7721,6 +7781,7 @@ static void nvme_exit(PCIDevice *pci_dev)
for (i = 1; i <= NVME_MAX_NAMESPACES; i++) {
ns = nvme_ns(n, i);
if (ns) {
+ nvme_zdc_unwatch(n, ns);
ns->attached--;
}
}
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index 25cd490c99..5629b61302 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -241,6 +241,7 @@ static void nvme_ns_zoned_init_state(NvmeNamespace *ns)
QTAILQ_INIT(&ns->closed_zones);
QTAILQ_INIT(&ns->full_zones);
QTAILQ_INIT(&ns->zdc_list);
+ QTAILQ_INIT(&ns->zdc_watchers);
zone = ns->zone_array;
for (i = 0; i < ns->num_zones; i++, zone++) {
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index ae65226150..5499105e7b 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -91,6 +91,14 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys,
#define NVME_NS(obj) \
OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS)
+typedef void (*NvmeNotifyFnc)(NvmeCtrl *n, NvmeNamespace *ns);
+
+typedef struct NvmeZdcNotify {
+ QTAILQ_ENTRY(NvmeZdcNotify) entry;
+ NvmeNotifyFnc notify;
+ NvmeCtrl *n;
+} NvmeZdcNotify;
+
typedef struct NvmeZdc {
QTAILQ_ENTRY(NvmeZdc) entry;
NvmeZone *zone;
@@ -179,6 +187,7 @@ typedef struct NvmeNamespace {
int64_t fto_ms;
QEMUTimer *active_timer;
+ QTAILQ_HEAD(, NvmeZdcNotify) zdc_watchers;
QTAILQ_HEAD(, NvmeZdc) zdc_list;
NvmeNamespaceParams params;
@@ -477,6 +486,8 @@ typedef struct NvmeCtrl {
uint64_t dbbuf_eis;
bool dbbuf_enabled;
+ bool zdc_event_queued;
+
struct {
MemoryRegion mem;
uint8_t *buf;
diff --git a/include/block/nvme.h b/include/block/nvme.h
index 9467d4b939..1662046c0d 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -830,6 +830,7 @@ enum NvmeAsyncEventRequest {
NVME_AER_INFO_SMART_TEMP_THRESH = 1,
NVME_AER_INFO_SMART_SPARE_THRESH = 2,
NVME_AER_INFO_NOTICE_NS_ATTR_CHANGED = 0,
+ NVME_AER_INFO_NOTICE_ZONE_DESC_CHANGED = 0xef,
};
typedef struct QEMU_PACKED NvmeAerResult {
@@ -1133,6 +1134,7 @@ typedef struct NvmeIdCtrlNvm {
enum NvmeIdCtrlOaes {
NVME_OAES_NS_ATTR = 1 << 8,
+ NVME_OAES_ZDC = 1 << 27,
};
enum NvmeIdCtrlCtratt {
--
2.27.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* RE: [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN
2022-10-21 23:10 [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
` (3 preceding siblings ...)
2022-10-21 23:10 ` [PATCH V2 4/4] hw/block/nvme: add zone descriptor changed AEN clay.mayers
@ 2022-10-24 15:22 ` Clay Mayers
2022-11-09 8:40 ` Klaus Jensen
5 siblings, 0 replies; 8+ messages in thread
From: Clay Mayers @ 2022-10-24 15:22 UTC (permalink / raw)
To: qemu-devel@nongnu.org
Cc: Keith Busch, Klaus Jensen, Fam Zheng, Phlippe Mathieu-Daudé,
Dmitry Fomichev
> From: clay.mayers@kioxia.com
> Sent: Friday, October 21, 2022 4:11 PM
> To: qemu-devel@nongnu.org
> Cc: Keith Busch <kbusch@kernel.org>; Klaus Jensen <its@irrelevant.dk>; Fam
> Zheng <fam@euphon.net>; Phlippe Mathieu-Daudé <f4bug@amsat.org>
> Subject: [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN
>
> From: Clay Mayers <clay.mayers@kioxia.com>
>
> ZNS controllers have the option to limit the time a zone can remain in
> the active state. It begins with a background process in the controller
> setting the finish-zone-recommended FZR attribute for a zone. As part of
> setting this attribute, the zone's id is added to the namespace's
> zone-descriptor-changed (ZDC) log page. If enabled, items added to the
> ZDC log page generate a ZDC "asynchronous event notification" AEN.
> Optionally,
> the control can induce a "zone excursion" forcing the zone into the finished
> state that also generates a ZDC event.
>
> Zone enabled applications need to properly handle ZDC events. In a real
> device,
> the timeout is many hours making testing an application difficult.
> Implemented is the generation of FZR ZDC events to speed up O/S and
> application
> testing.
>
> Added to the zoned NVMe command set is an optional, per-namespace timer
> (zoned.finish_time) to set the FZR attr for long-lived active zones; A per
> namespace ZDC log page; AEN results to including CQE.DW1 (the NSID of the
> ZDC
> AEN) and generating a ZDC AEN if it's been enabled. Zone excursions are not
> modeled.
>
> See section 5.5 of the NVMe Zoned Namespace Command Set Specification
> v1.1
> for more details.
>
> Changes since v1
> - Fixed offset length checking in zdc log page
> - Moved zdc_event_queued to the patch 4
> - Unwatched zdc events in nvme_exit()
>
> Clay Mayers (4):
> hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
> hw/block/nvme: add zone descriptor changed log page
> hw/block/nvme: supply dw1 for aen result
> hw/block/nvme: add zone descriptor changed AEN
>
> docs/system/devices/nvme.rst | 5 +
> hw/nvme/ctrl.c | 174 +++++++++++++++++++++++++++++++++--
> hw/nvme/ns.c | 15 +++
> hw/nvme/nvme.h | 37 +++++++-
> hw/nvme/trace-events | 3 +-
> include/block/nvme.h | 14 ++-
> 6 files changed, 233 insertions(+), 15 deletions(-)
>
> --
> 2.27.0
I forgot to add Dmitry to the CC list for v2
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN
2022-10-21 23:10 [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
` (4 preceding siblings ...)
2022-10-24 15:22 ` [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN Clay Mayers
@ 2022-11-09 8:40 ` Klaus Jensen
2022-11-23 1:03 ` Clay Mayers
5 siblings, 1 reply; 8+ messages in thread
From: Klaus Jensen @ 2022-11-09 8:40 UTC (permalink / raw)
To: clay.mayers
Cc: qemu-devel, Keith Busch, Fam Zheng, Phlippe Mathieu-Daudé,
Dmitry Fomichev
[-- Attachment #1: Type: text/plain, Size: 2231 bytes --]
On Oct 21 16:10, clay.mayers@kioxia.com wrote:
> From: Clay Mayers <clay.mayers@kioxia.com>
>
> ZNS controllers have the option to limit the time a zone can remain in
> the active state. It begins with a background process in the controller
> setting the finish-zone-recommended FZR attribute for a zone. As part of
> setting this attribute, the zone's id is added to the namespace's
> zone-descriptor-changed (ZDC) log page. If enabled, items added to the
> ZDC log page generate a ZDC "asynchronous event notification" AEN. Optionally,
> the control can induce a "zone excursion" forcing the zone into the finished
> state that also generates a ZDC event.
>
> Zone enabled applications need to properly handle ZDC events. In a real device,
> the timeout is many hours making testing an application difficult.
> Implemented is the generation of FZR ZDC events to speed up O/S and application
> testing.
>
> Added to the zoned NVMe command set is an optional, per-namespace timer
> (zoned.finish_time) to set the FZR attr for long-lived active zones; A per
> namespace ZDC log page; AEN results to including CQE.DW1 (the NSID of the ZDC
> AEN) and generating a ZDC AEN if it's been enabled. Zone excursions are not
> modeled.
>
> See section 5.5 of the NVMe Zoned Namespace Command Set Specification v1.1
> for more details.
>
> Changes since v1
> - Fixed offset length checking in zdc log page
> - Moved zdc_event_queued to the patch 4
> - Unwatched zdc events in nvme_exit()
>
> Clay Mayers (4):
> hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
> hw/block/nvme: add zone descriptor changed log page
> hw/block/nvme: supply dw1 for aen result
> hw/block/nvme: add zone descriptor changed AEN
>
> docs/system/devices/nvme.rst | 5 +
> hw/nvme/ctrl.c | 174 +++++++++++++++++++++++++++++++++--
> hw/nvme/ns.c | 15 +++
> hw/nvme/nvme.h | 37 +++++++-
> hw/nvme/trace-events | 3 +-
> include/block/nvme.h | 14 ++-
> 6 files changed, 233 insertions(+), 15 deletions(-)
>
> --
> 2.27.0
>
Nice work Clay!
Series looks good to me,
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN
2022-11-09 8:40 ` Klaus Jensen
@ 2022-11-23 1:03 ` Clay Mayers
0 siblings, 0 replies; 8+ messages in thread
From: Clay Mayers @ 2022-11-23 1:03 UTC (permalink / raw)
To: Klaus Jensen
Cc: qemu-devel@nongnu.org, Keith Busch, Fam Zheng,
Phlippe Mathieu-Daudé, Dmitry Fomichev
> From: Klaus Jensen <its@irrelevant.dk>
> Sent: Wednesday, November 9, 2022 12:40 AM
>
> On Oct 21 16:10, clay.mayers@kioxia.com wrote:
> > From: Clay Mayers <clay.mayers@kioxia.com>
> >
> > ZNS controllers have the option to limit the time a zone can remain in
> > the active state. It begins with a background process in the controller
> > setting the finish-zone-recommended FZR attribute for a zone. As part of
> > setting this attribute, the zone's id is added to the namespace's
> > zone-descriptor-changed (ZDC) log page. If enabled, items added to the
> > ZDC log page generate a ZDC "asynchronous event notification" AEN.
> Optionally,
> > the control can induce a "zone excursion" forcing the zone into the finished
> > state that also generates a ZDC event.
> >
> > Zone enabled applications need to properly handle ZDC events. In a real
> device,
> > the timeout is many hours making testing an application difficult.
> > Implemented is the generation of FZR ZDC events to speed up O/S and
> application
> > testing.
> >
> > Added to the zoned NVMe command set is an optional, per-namespace timer
> > (zoned.finish_time) to set the FZR attr for long-lived active zones; A per
> > namespace ZDC log page; AEN results to including CQE.DW1 (the NSID of the
> ZDC
> > AEN) and generating a ZDC AEN if it's been enabled. Zone excursions are not
> > modeled.
> >
> > See section 5.5 of the NVMe Zoned Namespace Command Set Specification
> v1.1
> > for more details.
> >
> > Changes since v1
> > - Fixed offset length checking in zdc log page
> > - Moved zdc_event_queued to the patch 4
> > - Unwatched zdc events in nvme_exit()
> >
> > Clay Mayers (4):
> > hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality
> > hw/block/nvme: add zone descriptor changed log page
> > hw/block/nvme: supply dw1 for aen result
> > hw/block/nvme: add zone descriptor changed AEN
> >
> > docs/system/devices/nvme.rst | 5 +
> > hw/nvme/ctrl.c | 174 +++++++++++++++++++++++++++++++++--
> > hw/nvme/ns.c | 15 +++
> > hw/nvme/nvme.h | 37 +++++++-
> > hw/nvme/trace-events | 3 +-
> > include/block/nvme.h | 14 ++-
> > 6 files changed, 233 insertions(+), 15 deletions(-)
> >
> > --
> > 2.27.0
> >
>
> Nice work Clay!
Thanks! I think you had implemented excursions at one point and I'm
not sure why it didn't make it in. I originally left out excursions because
all I needed was the AEN to test my Linux patch. I have a V3 in the works
with excursions so I can test what happens to libzbd applications. I can't
tell if there is real interest in this but I've tested rocksDB using zenfs and
it has difficulties with zone excursions. What I don't know is if excursions
are a corner case or not - in my world, they are not.
>
> Series looks good to me,
>
> Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2022-11-23 1:10 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-10-21 23:10 [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN clay.mayers
2022-10-21 23:10 ` [PATCH V2 1/4] hw/block/nvme: add ZONE_FINISH_RECOMMENDED functionality clay.mayers
2022-10-21 23:10 ` [PATCH V2 2/4] hw/block/nvme: add zone descriptor changed log page clay.mayers
2022-10-21 23:10 ` [PATCH V2 3/4] hw/block/nvme: supply dw1 for aen result clay.mayers
2022-10-21 23:10 ` [PATCH V2 4/4] hw/block/nvme: add zone descriptor changed AEN clay.mayers
2022-10-24 15:22 ` [PATCH V2 0/4] hw/block/nvme: Implement ZNS finish-zone ZDC AEN Clay Mayers
2022-11-09 8:40 ` Klaus Jensen
2022-11-23 1:03 ` Clay Mayers
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).