* [PATCH] nvme-multipath: sanitize nvme_update_ana_state()
@ 2019-07-16 7:18 Hannes Reinecke
2019-07-16 9:38 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Hannes Reinecke @ 2019-07-16 7:18 UTC (permalink / raw)
Commit 04e70bd4a026 ("nvme-multipath: do not select namespaces
which are about to be removed") introduced checks when traversing
the list of namespaces to avoid tripping over invalid namespaces.
A similar check is needed in nvme_update_ana_state() to skip updates
for any namespaces which will be removed.
With that we should also remove the WARN_ON() at the end of the
iteration; this will also be errorneously triggered if ANA Change AENS
are received during scanning.
Signed-off-by: Hannes Reinecke <hare at suse.com>
---
drivers/nvme/host/multipath.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index a9a927677970..78415f2909cd 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -449,6 +449,8 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
down_write(&ctrl->namespaces_rwsem);
list_for_each_entry(ns, &ctrl->namespaces, list) {
+ if (test_bit(NVME_NS_REMOVING, &ns->flags))
+ continue;
if (ns->head->ns_id != le32_to_cpu(desc->nsids[n]))
continue;
nvme_update_ns_ana_state(desc, ns);
@@ -456,7 +458,6 @@ static int nvme_update_ana_state(struct nvme_ctrl *ctrl,
break;
}
up_write(&ctrl->namespaces_rwsem);
- WARN_ON_ONCE(n < nr_nsids);
return 0;
}
--
2.16.4
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH] nvme-multipath: sanitize nvme_update_ana_state()
2019-07-16 7:18 [PATCH] nvme-multipath: sanitize nvme_update_ana_state() Hannes Reinecke
@ 2019-07-16 9:38 ` Christoph Hellwig
2019-07-16 10:02 ` Hannes Reinecke
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2019-07-16 9:38 UTC (permalink / raw)
On Tue, Jul 16, 2019@09:18:30AM +0200, Hannes Reinecke wrote:
> Commit 04e70bd4a026 ("nvme-multipath: do not select namespaces
> which are about to be removed") introduced checks when traversing
> the list of namespaces to avoid tripping over invalid namespaces.
> A similar check is needed in nvme_update_ana_state() to skip updates
> for any namespaces which will be removed.
> With that we should also remove the WARN_ON() at the end of the
> iteration; this will also be errorneously triggered if ANA Change AENS
> are received during scanning.
What is the problem we are trying to fix here? Just that we should
not call nvme_mpath_set_live? In that case it might make more sense
to move the removed check to just guard that call and at least keep
the values in struct nvme_ns uptodate.
^ permalink raw reply [flat|nested] 6+ messages in thread* [PATCH] nvme-multipath: sanitize nvme_update_ana_state()
2019-07-16 9:38 ` Christoph Hellwig
@ 2019-07-16 10:02 ` Hannes Reinecke
2019-07-17 14:38 ` Christoph Hellwig
0 siblings, 1 reply; 6+ messages in thread
From: Hannes Reinecke @ 2019-07-16 10:02 UTC (permalink / raw)
On 7/16/19 11:38 AM, Christoph Hellwig wrote:
> On Tue, Jul 16, 2019@09:18:30AM +0200, Hannes Reinecke wrote:
>> Commit 04e70bd4a026 ("nvme-multipath: do not select namespaces
>> which are about to be removed") introduced checks when traversing
>> the list of namespaces to avoid tripping over invalid namespaces.
>> A similar check is needed in nvme_update_ana_state() to skip updates
>> for any namespaces which will be removed.
>> With that we should also remove the WARN_ON() at the end of the
>> iteration; this will also be errorneously triggered if ANA Change AENS
>> are received during scanning.
>
> What is the problem we are trying to fix here? Just that we should
> not call nvme_mpath_set_live? In that case it might make more sense
> to move the removed check to just guard that call and at least keep
> the values in struct nvme_ns uptodate.
>
The underlying problem is the 'WARN_ON()', which actually can (and is)
triggered when ANA change AENs are received during controller (re-) connect.
At the same time it's quite pointless to update the ANA state for
namespaces which are on their way out; we _still_ seeing odd behaviour
during simultaneous rescan/reset, and this is just another safeguard.
But for now I'd be happy to drop the check for NS_REMOVING and just
delete the WARN_ON().
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare at suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG N?rnberg)
^ permalink raw reply [flat|nested] 6+ messages in thread* [PATCH] nvme-multipath: sanitize nvme_update_ana_state()
2019-07-16 10:02 ` Hannes Reinecke
@ 2019-07-17 14:38 ` Christoph Hellwig
2019-07-18 18:08 ` John Donnelly
0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2019-07-17 14:38 UTC (permalink / raw)
On Tue, Jul 16, 2019@12:02:15PM +0200, Hannes Reinecke wrote:
> The underlying problem is the 'WARN_ON()', which actually can (and is)
> triggered when ANA change AENs are received during controller (re-) connect.
> At the same time it's quite pointless to update the ANA state for
> namespaces which are on their way out; we _still_ seeing odd behaviour
> during simultaneous rescan/reset, and this is just another safeguard.
>
> But for now I'd be happy to drop the check for NS_REMOVING and just
> delete the WARN_ON().
Well, we should certainly skip the nvme_mpath_set_live as well for
a removing namespace. But yes, I'd prefer to move the check there
and then remove the WARN_ON over the big hammer.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] nvme-multipath: sanitize nvme_update_ana_state()
2019-07-17 14:38 ` Christoph Hellwig
@ 2019-07-18 18:08 ` John Donnelly
2019-07-19 7:06 ` Hannes Reinecke
0 siblings, 1 reply; 6+ messages in thread
From: John Donnelly @ 2019-07-18 18:08 UTC (permalink / raw)
> On Jul 17, 2019,@9:38 AM, Christoph Hellwig <hch@lst.de> wrote:
>
> On Tue, Jul 16, 2019@12:02:15PM +0200, Hannes Reinecke wrote:
>> The underlying problem is the 'WARN_ON()', which actually can (and is)
>> triggered when ANA change AENs are received during controller (re-) connect.
>> At the same time it's quite pointless to update the ANA state for
>> namespaces which are on their way out; we _still_ seeing odd behaviour
>> during simultaneous rescan/reset, and this is just another safeguard.
>>
>> But for now I'd be happy to drop the check for NS_REMOVING and just
>> delete the WARN_ON().
>
> Well, we should certainly skip the nvme_mpath_set_live as well for
> a removing namespace. But yes, I'd prefer to move the check there
> and then remove the WARN_ON over the big hammer.
>
Hi
I happen to be seeing this WARNING in a port we are using while doing failover testing . If the WARNING is removed , should there be another logging message of a similar nature that alerts of a configuration issue has been detected ? Or is this a transient condition ?
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] nvme-multipath: sanitize nvme_update_ana_state()
2019-07-18 18:08 ` John Donnelly
@ 2019-07-19 7:06 ` Hannes Reinecke
0 siblings, 0 replies; 6+ messages in thread
From: Hannes Reinecke @ 2019-07-19 7:06 UTC (permalink / raw)
On 7/18/19 8:08 PM, John Donnelly wrote:
>
>
>> On Jul 17, 2019,@9:38 AM, Christoph Hellwig <hch@lst.de> wrote:
>>
>> On Tue, Jul 16, 2019@12:02:15PM +0200, Hannes Reinecke wrote:
>>> The underlying problem is the 'WARN_ON()', which actually can (and is)
>>> triggered when ANA change AENs are received during controller (re-) connect.
>>> At the same time it's quite pointless to update the ANA state for
>>> namespaces which are on their way out; we _still_ seeing odd behaviour
>>> during simultaneous rescan/reset, and this is just another safeguard.
>>>
>>> But for now I'd be happy to drop the check for NS_REMOVING and just
>>> delete the WARN_ON().
>>
>> Well, we should certainly skip the nvme_mpath_set_live as well for
>> a removing namespace. But yes, I'd prefer to move the check there
>> and then remove the WARN_ON over the big hammer.
>>
>
> Hi
> I happen to be seeing this WARNING in a port we are using while doing
> failover testing . If the WARNING is removed , should there be another
> logging message of a similar nature that alerts of a configuration issue
> has been detected ? Or is this a transient condition ?
>
This is typically a transient condition.
If we receive an ANA chane AEN while scanning is in progress the ANA log
page will contain all namespaces, yet the scanning thread is still
processing and hasn't connected all namespaces. Hence we see this error.
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare at suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N?rnberg
GF: Felix Imend?rffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG N?rnberg)
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-07-19 7:06 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-07-16 7:18 [PATCH] nvme-multipath: sanitize nvme_update_ana_state() Hannes Reinecke
2019-07-16 9:38 ` Christoph Hellwig
2019-07-16 10:02 ` Hannes Reinecke
2019-07-17 14:38 ` Christoph Hellwig
2019-07-18 18:08 ` John Donnelly
2019-07-19 7:06 ` Hannes Reinecke
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox