public inbox for linux-nvme@lists.infradead.org
From: Chaitanya Kulkarni <chaitanyak@nvidia.com>
To: Taehee Yoo <ap420073@gmail.com>
Cc: "james.p.freyensee@intel.com" <james.p.freyensee@intel.com>,
	"hch@lst.de" <hch@lst.de>,
	"kbusch@kernel.org" <kbusch@kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"ming.l@ssi.samsung.com" <ming.l@ssi.samsung.com>,
	"larrystevenwise@gmail.com" <larrystevenwise@gmail.com>,
	"anthony.j.knapp@intel.com" <anthony.j.knapp@intel.com>,
	"pizhenwei@bytedance.com" <pizhenwei@bytedance.com>,
	Sagi Grimberg <sagi@grimberg.me>, "axboe@fb.com" <axboe@fb.com>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>
Subject: Re: [PATCH 1/4] nvme: fix delete uninitialized controller
Date: Wed, 4 Jan 2023 00:24:25 +0000
Message-ID: <af53be64-4ed3-96ea-7700-ac8f0cc8de6b@nvidia.com>
In-Reply-To: <b6b16a5a-d592-eb12-ea30-1535b80d60dc@grimberg.me>

On 1/3/23 02:30, Sagi Grimberg wrote:
>> nvme-fabric controllers can be deleted by
>> /sys/class/nvme/nvme<NS>/delete_controller
>> echo 1 > /sys/class/nvme/nvme<NS>/delete_controller
>> The above command will call nvme_delete_ctrl_sync().
>> This function internally tries to change ctrl->state to NVME_CTRL_DELETING.
>> NVME_CTRL_LIVE, NVME_CTRL_RESETTING, and NVME_CTRL_CONNECTING states can
>> be changed to NVME_CTRL_DELETING.
>> If the state is successfully changed, nvme_do_delete_ctrl() is called,
>> which is the actual delete logic of controller.
>>
>> controller initialization logic changes ctrl->state.
>> NEW -> CONNECTING -> LIVE.
>> NVME_CTRL_CONNECTING state doesn't ensure that initialization is done.
>>
>> So, delete logic can be called before the finish of controller
>> initialization.
>> So kernel panic would occur because nvme_do_delete_ctrl() dereferences
>> uninitialized values.

Thanks for discovering this. Do you perhaps have a sequence of commands
that reproduces it?

[...]

>> +++ b/drivers/nvme/host/core.c
>> @@ -243,7 +243,8 @@ static void nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl)
>>        * since ->delete_ctrl can free the controller.
>>        */
>>       nvme_get_ctrl(ctrl);
>> -    if (nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING))
>> +    if (test_bit(NVME_CTRL_STARTED_ONCE, &ctrl->flags) &&
>> +        nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING))
>>           nvme_do_delete_ctrl(ctrl);
> 
> So what is the outcome now? if the controller kept on dangling? what
> triggers the controller deletion?
> 
>>       nvme_put_ctrl(ctrl);
>>   }
> 
> I don't think this is the correct approach.
> the delete should fully fence the initialization and then delete
> the controller.
> 
> In this case, the transport driver should not quiesce a non-existent
> queue.
> 
> If further synchronization is needed, then it should be added so that
> delete will fully fence the initialization.

As stated here, I'd add complete fencing between the initialization and
delete transitions, so that delete waits for initialization to finish.
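
To illustrate what "delete fully fences the initialization" could look
like, here is a minimal userspace sketch. It is not kernel code and not
the fix that was merged: every name in it (model_ctrl, model_init_begin,
model_init_finish, model_delete_sync) is hypothetical, and a pthread
mutex and condition variable stand in for whatever synchronization the
transport would really use. The only parts taken from the thread are the
NEW -> CONNECTING -> LIVE sequence and the rule that LIVE, RESETTING and
CONNECTING may move to DELETING.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of the controller states in the thread. */
enum model_state {
	MODEL_NEW, MODEL_CONNECTING, MODEL_LIVE, MODEL_RESETTING, MODEL_DELETING
};

struct model_ctrl {
	pthread_mutex_t lock;
	pthread_cond_t init_done;   /* signalled when initialization ends */
	enum model_state state;
	bool init_in_progress;
	bool queues_allocated;      /* stands in for "initialized values" */
	bool deleted;
};

static void model_ctrl_setup(struct model_ctrl *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->init_done, NULL);
	c->state = MODEL_NEW;
	c->init_in_progress = false;
	c->queues_allocated = false;
	c->deleted = false;
}

/* NEW -> CONNECTING: initialization has started but is not complete. */
static void model_init_begin(struct model_ctrl *c)
{
	pthread_mutex_lock(&c->lock);
	c->state = MODEL_CONNECTING;
	c->init_in_progress = true;
	pthread_mutex_unlock(&c->lock);
}

/* Finish initialization and wake any delete that is fenced on it. */
static void model_init_finish(struct model_ctrl *c)
{
	pthread_mutex_lock(&c->lock);
	assert(c->init_in_progress);
	c->queues_allocated = true;
	if (c->state == MODEL_CONNECTING)   /* a pending delete keeps DELETING */
		c->state = MODEL_LIVE;
	c->init_in_progress = false;
	pthread_cond_broadcast(&c->init_done);
	pthread_mutex_unlock(&c->lock);
}

/* Returns true if this call performed the delete. */
static bool model_delete_sync(struct model_ctrl *c)
{
	pthread_mutex_lock(&c->lock);
	if (c->state != MODEL_LIVE && c->state != MODEL_RESETTING &&
	    c->state != MODEL_CONNECTING) {
		pthread_mutex_unlock(&c->lock);
		return false;
	}
	c->state = MODEL_DELETING;

	/* The fence: never tear down while initialization is still running. */
	while (c->init_in_progress)
		pthread_cond_wait(&c->init_done, &c->lock);

	c->queues_allocated = false;        /* teardown sees initialized state */
	c->deleted = true;
	pthread_mutex_unlock(&c->lock);
	return true;
}
```

In this model a delete that arrives during CONNECTING is still accepted,
so the controller is not left dangling, but the teardown itself runs only
after initialization has completed.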

-ck


