From: Dmitry Bogdanov <d.bogdanov@yadro.com>
To: Guixin Liu <kanie@linux.alibaba.com>
Cc: <hch@lst.de>, <sagi@grimberg.me>, <kch@nvidia.com>,
<linux-nvme@lists.infradead.org>
Subject: Re: [PATCH v11 1/1] nvmet: support reservation feature
Date: Mon, 7 Oct 2024 11:08:53 +0300 [thread overview]
Message-ID: <20241007080853.GJ22571@yadro.com> (raw)
In-Reply-To: <b5059469-d908-41e2-811d-8e4565d0b1dd@linux.alibaba.com>
On Sat, Oct 05, 2024 at 10:26:58PM +0800, Guixin Liu wrote:
> 在 2024/9/30 20:53, Dmitry Bogdanov 写道:
> > On Sun, Sep 29, 2024 at 11:14:10AM +0800, Guixin Liu wrote:
> > > This patch implements the reservation feature, includes:
> > > 1. reservation register(register, unregister and replace).
> > > 2. reservation acquire(acquire, preempt, preempt and abort).
> > > 3. reservation release(release and clear).
> > > 4. reservation report.
> > > 5. set feature and get feature of reservation notify mask.
> > > 6. get log page of reservation event.
> > >
> > > And also make reservation configurable, one can set ns to support
> > > reservation before enable ns. The default of resv_enable is false.
> > >
> > > Signed-off-by: Guixin Liu <kanie@linux.alibaba.com>
> > > ---
> > > +
> > > +static void nvmet_execute_pr_report(struct nvmet_req *req)
> > > +{
> > > + u32 cdw11 = le32_to_cpu(req->cmd->common.cdw11);
> > > + u32 cdw10 = le32_to_cpu(req->cmd->common.cdw10);
> > > + u32 num_bytes = 4 * (cdw10 + 1); /* cdw10 is number of dwords */
> > > + u8 eds = cdw11 & 1; /* Extended data structure, bit 00 */
> > > + struct nvme_registered_ctrl_ext *ctrl_eds;
> > > + struct nvme_reservation_status_ext *data;
> > > + struct nvmet_pr *pr = &req->ns->pr;
> > > + struct nvmet_pr_registrant *holder;
> > > + struct nvmet_pr_registrant *reg;
> > > + u16 num_ctrls = 0;
> > > + u16 status;
> > > + u8 rtype;
> > > +
> > > + /* nvmet hostid(uuid_t) is 128 bit. */
> > > + if (!eds) {
> > > + req->error_loc = offsetof(struct nvme_common_command, cdw11);
> > > + status = NVME_SC_HOST_ID_INCONSIST | NVME_SC_DNR;
> > > + goto out;
> > > + }
> > > +
> > > + if (num_bytes < sizeof(struct nvme_reservation_status_ext)) {
> > > + req->error_loc = offsetof(struct nvme_common_command, cdw10);
> > > + status = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
> > > + goto out;
> > > + }
> > > +
> > > + data = kmalloc(num_bytes, GFP_KERNEL);
> > > + if (!data) {
> > > + status = NVME_SC_INTERNAL;
> > > + goto out;
> > > + }
> > > + memset(data, 0, num_bytes);
> > > + data->gen = cpu_to_le32(atomic_read(&pr->generation));
> > > + data->ptpls = 0;
> > > + ctrl_eds = data->regctl_eds;
> > > +
> > > + rcu_read_lock();
> > > + holder = rcu_dereference(pr->holder);
> > > + rtype = holder ? holder->rtype : 0;
> > > + data->rtype = rtype;
> > > +
> > > + list_for_each_entry_rcu(reg, &pr->registrant_list, entry) {
> > > + if ((void *)ctrl_eds >= (void *)(data + num_bytes))
> > > + break;
> > > + /*
> > > + * Dynamic controller, set cntlid to 0xffff.
> > > + */
> > > + ctrl_eds->cntlid = 0xffff;
> > > + if (rtype == NVME_PR_WRITE_EXCLUSIVE_ALL_REGS ||
> > > + rtype == NVME_PR_EXCLUSIVE_ACCESS_ALL_REGS)
> > > + ctrl_eds->rcsts = 1;
> > > + if (holder && uuid_equal(®->hostid, &holder->hostid))
> > is just (reg == holder) not enough here?
> Yes.
> > > + ctrl_eds->rcsts = 1;
> > > + uuid_copy((uuid_t *)&ctrl_eds->hostid, ®->hostid);
> > > + ctrl_eds->rkey = cpu_to_le64(reg->rkey);
> > > + ctrl_eds++;
> > > + num_ctrls++;
> > > + }
> > > + rcu_read_unlock();
> > > +
> > > + put_unaligned_le16(num_ctrls, data->regctl);
> > Here you should report the number of all registered registrants in ns,
> > not the only reported. It is used by the host to understand that it got
> > not a full response.
> > NUMD field (cdw11) is about transfering the response. But the structure itsef in
> > the reposnse has to be complete.
>
> The NVMe spec says that:
>
> "This indicates the number of Registered Controller data structures
> and/or Registered Controller extended data structures contained in this
> data structure."
You read not the latest Spec again. Spec v2.1 has another statement:
Number of Registrants (REGSTRNT): This field indicates the number of registrants of the
namespace. This indicates the number of Registrant data structures or Registrant Extended data
structures contained in this data structure.
>
> It means that host use this filed to obtain the reported registrants number.
>
Think from the host point of view - how are you going to understand
how much registrants are there in the namespace?
BR,
Dmitry
next prev parent reply other threads:[~2024-10-07 9:30 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-29 3:14 [PATCH v11 0/1] Implement the NVMe reservation feature Guixin Liu
2024-09-29 3:14 ` [PATCH v11 1/1] nvmet: support " Guixin Liu
2024-09-30 12:53 ` Dmitry Bogdanov
2024-10-05 14:26 ` Guixin Liu
2024-10-07 8:08 ` Dmitry Bogdanov [this message]
2024-10-08 7:27 ` Guixin Liu
2024-10-08 8:45 ` Dmitry Bogdanov
2024-10-08 9:25 ` Guixin Liu
2024-10-02 8:09 ` Christoph Hellwig
2024-10-05 15:09 ` Guixin Liu
2024-10-07 6:36 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241007080853.GJ22571@yadro.com \
--to=d.bogdanov@yadro.com \
--cc=hch@lst.de \
--cc=kanie@linux.alibaba.com \
--cc=kch@nvidia.com \
--cc=linux-nvme@lists.infradead.org \
--cc=sagi@grimberg.me \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.