From: Sagi Grimberg <sagi@grimberg.me>
To: James Smart <james.smart@broadcom.com>, linux-nvme@lists.infradead.org
Cc: Keith Busch <kbusch@kernel.org>,
Israel Rukshin <israelr@mellanox.com>,
Christoph Hellwig <hch@lst.de>,
stable@vger.kernel.org, Max Gurtovoy <maxg@mellanox.com>
Subject: Re: [PATCH] nvme: Revert: Fix controller creation races with teardown flow
Date: Tue, 1 Sep 2020 15:01:19 -0700 [thread overview]
Message-ID: <7e78b2e6-d103-23a1-a9ab-d12336a9c089@grimberg.me> (raw)
In-Reply-To: <f7d0c08f-2a34-fffa-7962-d0641bc579fd@broadcom.com>
>>>>>> This is indeed a regression.
>>>>>>
>>>>>> Perhaps we should also revert:
>>>>>> 12a0b6622107 ("nvme: don't hold nvmf_transports_rwsem for more
>>>>>> than transport lookups")
>>>>>>
>>>>>> Which inherently caused this by removing the serialization of
>>>>>> .create_ctrl()...
>>>>>
>>>>> no, I believe the patch on the semaphore is correct. Otherwise -
>>>>> things can be blocked a long time.. a minute (1 cmd timeout) or
>>>>> even multiple minutes in the case where a command failure in core
>>>>> layers effectively gets ignored and thus doesn't cause the error
>>>>> path in the transport. There can be multiple /dev/nvme-fabrics
>>>>> commands stacked up that can make the delays look much longer to
>>>>> the last guy.
>>>>>
>>>>> as far as creation vs teardown... yeah, not fun, but there are
>>>>> other ways to deal with it. FC: I got rid of the separate
>>>>> create/reconnect threads a while ago thus the
>>>>> return-control-while-reconnecting behavior, so I've had to deal
>>>>> with it. It's one area it'd be nice to see some convergence in
>>>>> implementation again between transports.
>>>>
>>>> Doesn't fc have a bug there? in create_ctrl after flushing the
>>>> connect_work, what is telling it if delete is running in with it
>>>> (or that it already ran...)
>>>
>>> I guess I don't understand what the bug is you are thinking about.
>>> Maybe there's a short period that the ctrl ptr is perhaps freed, thus
>>> the pointer shouldn't be used - but I don't see it as almost
>>> everything is simply looking at the value of the pointer, not
>>> dereferencing it.
>>
>> I'm referring to nvme_fc_init_ctrl, if delete happens while it
>> is waiting in flush_delayed_work(&ctrl->connect_work); won't you
>> dereference and return a controller that is possibly already
>> deleted/freed?
>
> ok - that matches my "short period" and it is possible as there's one
> immediate printf that may dereference the ptr. After that, it's
> comparisons of the pointer value. I can move the printf to avoid the
> issue. That window's rather small.
But you also return back &ctrl->ctrl, that is another dereference, and
what will make ctrl to be an ERR_PTR?
Anyway, we should probably come up with something more robust...
>>> I do have a bug or two with delete association fighting with
>>> create_association - but it's mainly due to nvme_fc_error_recovery
>>> not the delete routine. I've reworked this area after seeing your
>>> other patches and will be posting after some more testing. But no
>>> reason for synchronizing all ctrl creates.
>>
>> Is it that big of an issue? it should fail rather quickly shouldn't it?
>
> not sure what you are asking. if it's how long to fail the creation of
> a new association - it's at least 60s (an admin command timeout).
That's the worst case (admin command timeout), but is it the most common
case?
Would making the timeouts shorter in the initial connect make sense?
Just throwing out ideas...
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
next prev parent reply other threads:[~2020-09-01 22:01 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-28 19:01 [PATCH] nvme: Revert: Fix controller creation races with teardown flow James Smart
2020-08-28 19:08 ` Sagi Grimberg
2020-08-28 20:24 ` James Smart
2020-08-28 23:59 ` Sagi Grimberg
2020-08-31 22:26 ` James Smart
2020-08-31 23:15 ` Sagi Grimberg
2020-09-01 15:39 ` James Smart
2020-09-01 22:01 ` Sagi Grimberg [this message]
2020-09-02 0:01 ` James Smart
2020-09-08 17:47 ` Christoph Hellwig
2020-09-08 17:47 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7e78b2e6-d103-23a1-a9ab-d12336a9c089@grimberg.me \
--to=sagi@grimberg.me \
--cc=hch@lst.de \
--cc=israelr@mellanox.com \
--cc=james.smart@broadcom.com \
--cc=kbusch@kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=maxg@mellanox.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).