public inbox for linux-nvme@lists.infradead.org
From: Sagi Grimberg <sagi@grimberg.me>
To: Guixin Liu <kanie@linux.alibaba.com>, Keith Busch <kbusch@kernel.org>
Cc: hch@lst.de, kch@nvidia.com, linux-nvme@lists.infradead.org
Subject: Re: [PATCH v7 1/1] nvmet: support reservation feature
Date: Fri, 8 Mar 2024 12:07:52 +0200
Message-ID: <d6073345-ceef-49b2-834e-618bbbb78de3@grimberg.me>
In-Reply-To: <0c33b803-baff-45af-90bb-623822f756b8@linux.alibaba.com>



On 08/03/2024 11:15, Guixin Liu wrote:
>
>> Unlike abort, preempt-and-abort needs a semantic guarantee because the
>> consumer may rely on this for fencing purposes. So it cannot be
>> supported on a "best effort" basis, I think.
>>
>> A possible implementation would be not to abort, as there is no such
>> interface, but nvmet may wait for all pending ns IO to complete while
>> disallowing new IO from coming in (using percpu_ref_kill and
>> percpu_ref_resurrect on ns->ref). This won't work very efficiently
>> with ALL_REGS reservations though.
>
> Hi Sagi,
>
> I found that if we return an error when the call to
> percpu_ref_tryget_live(&ns->ref) fails, it might interrupt the IO of
> hosts that still have permission. Additionally, preempt_and_abort
> itself holds an ns->ref, so we cannot wait for the ref to drop to zero.
>
> The solution I can think of is to add a "per-namespace" percpu_ref to
> the controller for counting IO issued to a particular namespace by that
> controller. Then, during the execution of preempt_and_abort, we wait
> for the counts of the preempted and unregistered controllers to drop
> to zero.

Yes, that is what I had in mind as well. Obviously the ns->ref cannot be 
used for this purpose.
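
To make the per-controller, per-namespace counting idea above concrete,
here is a minimal userspace sketch. It is not nvmet code and does not use
the kernel percpu_ref API: it models the same "stop new I/O, then wait for
in-flight I/O to drain" pattern with C11 atomics. The struct pc_ns_gate
and all gate_*() helpers are hypothetical names invented for this
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical gate: one per (controller, namespace) pair. */
struct pc_ns_gate {
	atomic_int  inflight;	/* I/Os this controller has issued to this ns */
	atomic_bool fenced;	/* set once this controller has been preempted */
};

static void gate_init(struct pc_ns_gate *g)
{
	atomic_init(&g->inflight, 0);
	atomic_init(&g->fenced, false);
}

/* Submission path: take a reference unless the gate has been fenced. */
static bool gate_tryget(struct pc_ns_gate *g)
{
	if (atomic_load(&g->fenced))
		return false;
	atomic_fetch_add(&g->inflight, 1);
	/* Re-check so a concurrent gate_fence() cannot be missed. */
	if (atomic_load(&g->fenced)) {
		atomic_fetch_sub(&g->inflight, 1);
		return false;
	}
	return true;
}

/* Completion path: drop the reference taken by gate_tryget(). */
static void gate_put(struct pc_ns_gate *g)
{
	int prev = atomic_fetch_sub(&g->inflight, 1);

	assert(prev > 0);
	(void)prev;
}

/* Preempt-and-abort path: refuse new I/O from the preempted controller. */
static void gate_fence(struct pc_ns_gate *g)
{
	atomic_store(&g->fenced, true);
}

/* True once every I/O admitted before the fence has completed. */
static bool gate_drained(struct pc_ns_gate *g)
{
	return atomic_load(&g->inflight) == 0;
}
```

Because each controller has its own gate, fencing the preempted
controller leaves I/O from controllers that still hold a valid
registration untouched, which is the property a shared ns->ref cannot
provide. A real implementation would sleep until the count drains rather
than poll gate_drained().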

>
> The nsid is user-specified, so we cannot use an array to store the
> per-namespace percpu_ref, and using an xarray would increase lookup
> overhead.

Yes, that is tricky to get right.

>
> What do you think, Sagi? Or maybe we can declare that preempt_and_abort
> is not supported, just like SPDK does.

It can definitely come incrementally, but at the very least it should not
be supported incorrectly.

Out of curiosity, doesn't your use-case need fencing protection against
inflight I/Os being reordered during preemption?

