* Unable to reconnect namespace via NVMe/TCP
From: Anton Gavriliuk @ 2025-08-12 15:48 UTC
To: linux-nvme

Hi

There are NVMe/TCP target and initiator servers, both running
RHEL10 (6.12.0-55.25.1.el10_0.x86_64).

The NVMe/TCP target exports a single NVMe SSD:

  "namespaces": [
    {
      "device": {
        "nguid": "01000000-0000-0000-8ce3-8ee3064aa4f2",
        "path": "/dev/nvme0n1"
      },
      "enable": 1,
      "nsid": 1
    }
  ],

If the NVMe/TCP target is not available, the initiator tries to reconnect
every 10 seconds:

  [ 2586.071048] nvme nvme9: failed to connect socket: -111
  [ 2586.071403] nvme nvme9: Failed reconnect attempt 16/-1
  [ 2586.071565] nvme nvme9: Reconnecting in 10 seconds...
  [ 2596.310921] nvme nvme9: failed to connect socket: -111
  [ 2596.311186] nvme nvme9: Failed reconnect attempt 17/-1
  [ 2596.311349] nvme nvme9: Reconnecting in 10 seconds...
  [ 2606.550772] nvme nvme9: failed to connect socket: -111
  [ 2606.551252] nvme nvme9: Failed reconnect attempt 18/-1
  [ 2606.551592] nvme nvme9: Reconnecting in 10 seconds...

When the NVMe/TCP target becomes available again, the initiator fails to
reconnect the namespace:

  [ 2606.551592] nvme nvme9: Reconnecting in 10 seconds...
  [ 2616.793080] nvme nvme9: creating 16 I/O queues.
  [ 2616.829881] nvme nvme9: mapped 16/0/0 default/read/poll queues.
  [ 2616.833685] nvme nvme9: Successfully reconnected (attempt 19/-1)
  [ 2616.834446] nvme nvme9: identifiers changed for nsid 1
  [ 2616.835618] block nvme9n1: no usable path - requeuing I/O
  [ 2616.856602] block nvme9n1: no available path - failing I/O
  [ 2616.856811] block nvme9n1: no available path - failing I/O

and there is no nvme9n1 namespace in the "nvme list" output.

Anton

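For reference, the -111 in the "failed to connect socket" lines is a negated errno value. The following is a minimal Python sketch for decoding such values; the helper name describe_kernel_errno is hypothetical, and the mapping of 111 to ECONNREFUSED is specific to Linux errno numbering.

```python
import errno
import os


def describe_kernel_errno(code: int) -> str:
    """Return 'NAME: message' for a kernel-style (negative) errno value."""
    number = abs(code)
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


if __name__ == "__main__":
    # On Linux, 111 is ECONNREFUSED; other platforms number errnos differently.
    print(describe_kernel_errno(-111))
```

A refused connection is consistent with the target host being reachable on the network while nothing is listening on the NVMe/TCP port yet, which is what one would expect while the target is still coming up.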
* Re: Unable to reconnect namespace via NVMe/TCP
From: Nilay Shroff @ 2025-08-13 6:01 UTC
To: Anton Gavriliuk, linux-nvme

On 8/12/25 9:18 PM, Anton Gavriliuk wrote:
> when NVMe/TCP target become available, initiator failed reconnect the namespace
>
> [ 2606.551592] nvme nvme9: Reconnecting in 10 seconds...
> [ 2616.793080] nvme nvme9: creating 16 I/O queues.
> [ 2616.829881] nvme nvme9: mapped 16/0/0 default/read/poll queues.
> [ 2616.833685] nvme nvme9: Successfully reconnected (attempt 19/-1)
> [ 2616.834446] nvme nvme9: identifiers changed for nsid 1

It seems that after the initiator reconnected to the target, the namespace
identifiers (UUID, NGUID, EUI64, or CSI, the command set identifier) changed
for nsid 1, so the host kernel removed this namespace (nvme9n1).

> [ 2616.835618] block nvme9n1: no usable path - requeuing I/O
> [ 2616.856602] block nvme9n1: no available path - failing I/O
> [ 2616.856811] block nvme9n1: no available path - failing I/O
>
> and there is no nvme9n1 namespace in the "nvme list" output.
>

So you may want to check on the target what could have changed for nsid 1
during the reconnect window.

As a side note, it’s generally best to reproduce and report issues with an
upstream kernel when posting to the kernel mailing list. This helps ensure
the report gets prompt attention and makes it easier for others to debug
and assist.

Thanks,
--Nilay

* Re: Unable to reconnect namespace via NVMe/TCP
From: Chris Leech @ 2025-08-18 20:58 UTC
To: Anton Gavriliuk
Cc: linux-nvme

On Tue, Aug 12, 2025 at 06:48:29PM +0300, Anton Gavriliuk wrote:
> Hi
>
> There are NVMe/TCP target and initiator servers, both running on
> RHEL10 (6.12.0-55.25.1.el10_0.x86_64)
>
> NVMe/TCP target exports single NVMe SSD
>
> "namespaces": [
>   {
>     "device": {
>       "nguid": "01000000-0000-0000-8ce3-8ee3064aa4f2",
>       "path": "/dev/nvme0n1"
>     },
>     "enable": 1,
>     "nsid": 1
>   }
> ],
>
> If NVMe/TCP target is not available, initiator tries to reconnect
> every 10 seconds

How is the target becoming unavailable? Is it a network interruption,
or is the target being rebooted or reconfigured?

Is the shared snippet the entire "namespaces" section of the
configuration file?

It looks like the target code will generate a random uuid for a device
when it's configured, which could then trip up the host attempting to
reconnect across a target reboot. But I think the uuid can be saved as
well as the nguid in the target configuration.

- Chris Leech

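On the host side, an identifier that differs after a reconnect shows up as the "identifiers changed for nsid N" kernel message quoted in the original report. The following is a minimal Python sketch for spotting that message in captured dmesg text; the function name and regular expression are illustrative assumptions, and the sample lines are copied from the report.

```python
import re

# Matches lines such as:
#   [ 2616.834446] nvme nvme9: identifiers changed for nsid 1
IDENT_CHANGED = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+nvme (?P<ctrl>nvme\d+): "
    r"identifiers changed for nsid (?P<nsid>\d+)"
)


def find_identifier_changes(log_text: str) -> list[tuple[float, str, int]]:
    """Return (timestamp, controller, nsid) for each matching log line."""
    events = []
    for line in log_text.splitlines():
        match = IDENT_CHANGED.search(line)
        if match:
            events.append((float(match["ts"]), match["ctrl"], int(match["nsid"])))
    return events


# Log lines taken verbatim from the original report.
SAMPLE = """\
[ 2616.833685] nvme nvme9: Successfully reconnected (attempt 19/-1)
[ 2616.834446] nvme nvme9: identifiers changed for nsid 1
[ 2616.835618] block nvme9n1: no usable path - requeuing I/O
"""

if __name__ == "__main__":
    for ts, ctrl, nsid in find_identifier_changes(SAMPLE):
        print(f"{ts:.6f} {ctrl}: identifiers changed for nsid {nsid}")
```

Seeing this line immediately after "Successfully reconnected" suggests the transport recovered and the namespace was dropped because of its identity, not because of connectivity.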
* Re: Unable to reconnect namespace via NVMe/TCP
From: Maurizio Lombardi @ 2025-08-19 13:12 UTC
To: Chris Leech, Anton Gavriliuk
Cc: linux-nvme

On Mon Aug 18, 2025 at 10:58 PM CEST, Chris Leech wrote:
> It looks like the target code will generate a random uuid for a device
> when it's configured, which could then trip up the host attempting to
> reconnect across a target reboot. But I think the uuid can be saved as
> well as the nguid in the target configuration.

Indeed, normally, nvmetcli saves the uuid in the config file.

Maurizio

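A saved target configuration can be checked for namespaces that carry no persisted uuid. The following is a minimal Python sketch under two assumptions: that the key is named "uuid" and sits next to "nguid" under "device", and that the caller passes in the "namespaces" array. The function name is hypothetical, and the sample data is the fragment from the original report.

```python
import json


def namespaces_missing_uuid(namespaces: list[dict]) -> list[int]:
    """Return the nsid of every namespace entry whose device has no "uuid"."""
    missing = []
    for ns in namespaces:
        device = ns.get("device", {})
        # Assumption: a persisted uuid appears as device["uuid"].
        if not device.get("uuid"):
            missing.append(ns.get("nsid"))
    return missing


# The "namespaces" fragment as posted in the original report.
REPORTED = json.loads("""
[
  {
    "device": {
      "nguid": "01000000-0000-0000-8ce3-8ee3064aa4f2",
      "path": "/dev/nvme0n1"
    },
    "enable": 1,
    "nsid": 1
  }
]
""")

if __name__ == "__main__":
    print(namespaces_missing_uuid(REPORTED))
```

Any nsid this reports is a candidate for getting a freshly generated uuid each time the target is configured, which is the situation described above.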
* Re: Unable to reconnect namespace via NVMe/TCP
From: Anton Gavriliuk @ 2025-08-19 13:45 UTC
To: Chris Leech
Cc: linux-nvme

> How is the target becoming unavailable? Is it as network interruption, or is the target being rebooted or reconfigured?

NVMe/TCP target reboot.

> Is the shared snippet the entire "namespaces" section of the configuration file?

Yes, correct.

Anton

* Re: Unable to reconnect namespace via NVMe/TCP
From: Hannes Reinecke @ 2025-08-19 6:09 UTC
To: linux-nvme

On 8/12/25 17:48, Anton Gavriliuk wrote:
> when NVMe/TCP target become available, initiator failed reconnect the namespace
>
> [ 2606.551592] nvme nvme9: Reconnecting in 10 seconds...
> [ 2616.793080] nvme nvme9: creating 16 I/O queues.
> [ 2616.829881] nvme nvme9: mapped 16/0/0 default/read/poll queues.
> [ 2616.833685] nvme nvme9: Successfully reconnected (attempt 19/-1)
> [ 2616.834446] nvme nvme9: identifiers changed for nsid 1
> [ 2616.835618] block nvme9n1: no usable path - requeuing I/O
> [ 2616.856602] block nvme9n1: no available path - failing I/O
> [ 2616.856811] block nvme9n1: no available path - failing I/O
>
> and there is no nvme9n1 namespace in the "nvme list" output.
>
This looks like the missed re-scan issue I found recently.
Should be fixed with
9546ad1a9bda ("nvme: requeue namespace scan on missed AENs")

(And you are running RHEL. Please open a bugzilla with RH.)
(And why am I even answering that?)

Cheers,

Hannes
--
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

* Re: Unable to reconnect namespace via NVMe/TCP
From: Yi Zhang @ 2025-08-19 10:55 UTC
To: Hannes Reinecke
Cc: linux-nvme, Chris Leech, Maurizio Lombardi

Hi Hannes

I tried with the upstream kernel v6.17-rc2, and it can still be reproduced.

# dmesg | tail -30
[  219.560691] nvme nvme0: Failed reconnect attempt 3/-1
[  219.565784] nvme nvme0: Reconnecting in 10 seconds...
[  229.795215] nvme nvme0: failed to connect socket: -111
[  229.800369] nvme nvme0: Failed reconnect attempt 4/-1
[  229.805450] nvme nvme0: Reconnecting in 10 seconds...
[  240.034918] nvme nvme0: failed to connect socket: -111
[  240.040093] nvme nvme0: Failed reconnect attempt 5/-1
[  240.045165] nvme nvme0: Reconnecting in 10 seconds...
[  250.274619] nvme nvme0: failed to connect socket: -111
[  250.279776] nvme nvme0: Failed reconnect attempt 6/-1
[  250.284855] nvme nvme0: Reconnecting in 10 seconds...
[  260.514102] nvme nvme0: failed to connect socket: -111
[  260.519261] nvme nvme0: Failed reconnect attempt 7/-1
[  260.524340] nvme nvme0: Reconnecting in 10 seconds...
[  270.754031] nvme nvme0: failed to connect socket: -111
[  270.759184] nvme nvme0: Failed reconnect attempt 8/-1
[  270.764263] nvme nvme0: Reconnecting in 10 seconds...
[  280.993410] nvme nvme0: failed to connect socket: -111
[  280.998591] nvme nvme0: Failed reconnect attempt 9/-1
[  281.003653] nvme nvme0: Reconnecting in 10 seconds...
[  291.249090] nvme nvme0: creating 4 I/O queues.
[  291.264959] nvme nvme0: mapped 4/0/0 default/read/poll queues.
[  291.271975] nvme nvme0: Successfully reconnected (attempt 10/-1)
[  291.273897] nvme nvme0: identifiers changed for nsid 2
[  291.283631] block nvme0n1: no available path - failing I/O
[  291.289139] block nvme0n1: no available path - failing I/O
[  291.294649] block nvme0n1: no available path - failing I/O
[  291.300159] block nvme0n1: no available path - failing I/O
[  291.305665] block nvme0n1: no available path - failing I/O
[  291.311197] block nvme0n1: no available path - failing I/O

--
Best Regards,
  Yi Zhang

* Re: Unable to reconnect namespace via NVMe/TCP
From: Yi Zhang @ 2025-08-19 13:43 UTC
To: Hannes Reinecke, Anton Gavriliuk
Cc: linux-nvme, Chris Leech, Maurizio Lombardi

Hi Anton

Please try adding the uuid to the device field of the namespace
configuration; that should fix your issue.

--
Best Regards,
  Yi Zhang

* Re: Unable to reconnect namespace via NVMe/TCP
From: Anton Gavriliuk @ 2025-08-19 14:29 UTC
To: Yi Zhang
Cc: Hannes Reinecke, linux-nvme, Chris Leech, Maurizio Lombardi

Hi Yi Zhang

> Please try to add the uuid in the device field, which should fix your issue.

On the NVMe/TCP target, the nguid and uuid of the given device
(/dev/nvme0n1) are exactly the same:

  [root@memverge4 ~]# cat /sys/class/block/nvme0n1/uuid
  01000000-0000-0000-8ce3-8ee3064aa4f2
  [root@memverge4 ~]# cat /sys/class/block/nvme0n1/nguid
  01000000-0000-0000-8ce3-8ee3064aa4f2

So I added the uuid:

  "namespaces": [
    {
      "device": {
        "nguid": "01000000-0000-0000-8ce3-8ee3064aa4f2",
        "uuid": "01000000-0000-0000-8ce3-8ee3064aa4f2",
        "path": "/dev/nvme0n1"
      },
      "enable": 1,
      "nsid": 1
    }
  ],

Yes, this fixed my issue: the namespace is now reconnected automatically
after an NVMe/TCP target reboot.

Anton

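The sysfs check in the message above can be scripted so that identifiers are captured before and after a target reboot and compared. The following is a minimal Python sketch; the helper names are hypothetical, and the only paths it relies on are the uuid and nguid attributes under /sys/class/block/ that the message reads with cat.

```python
from pathlib import Path


def read_namespace_ids(device: str, sysfs_root: str = "/sys/class/block") -> dict:
    """Read the uuid and nguid attributes of a block device; None if unreadable."""
    ids = {}
    for attr in ("uuid", "nguid"):
        path = Path(sysfs_root) / device / attr
        try:
            ids[attr] = path.read_text().strip()
        except OSError:
            # Attribute missing or device absent (for example, namespace removed).
            ids[attr] = None
    return ids


def changed_ids(before: dict, after: dict) -> list[str]:
    """Return the names of identifiers whose values differ between snapshots."""
    return sorted(key for key in before if before[key] != after.get(key))


if __name__ == "__main__":
    # Device name taken from the thread; adjust for the system being checked.
    print(read_namespace_ids("nvme0n1"))
```

Calling read_namespace_ids() once before the reboot and once after, then passing both results to changed_ids(), shows which identifier differs, if any.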