From mboxrd@z Thu Jan  1 00:00:00 1970
From: hch@lst.de (Christoph Hellwig)
Date: Thu, 14 Jun 2018 14:42:37 +0200
Subject: please revert a nvme-cli commit
In-Reply-To: <64647508-fd1f-9ec2-171d-88ba2a5653b7@broadcom.com>
References: <20180613075557.GA22940@lst.de>
 <64647508-fd1f-9ec2-171d-88ba2a5653b7@broadcom.com>
Message-ID: <20180614124237.GA28896@lst.de>

On Wed, Jun 13, 2018 at 08:51:38AM -0700, James Smart wrote:
> Really?  sigh.  I have lots of consumers that have no issues with these
> changes and there is nothing that acts "incompatible".  It's been 1.5
> months - where have you been?
>
> These conditions can occur independent of any change in kernel 
> implementation and are significant robustness corrections.

I don't agree.  For one, so far we've guaranteed that the device node
appears before we return from the write to the /dev/nvme-fabrics
device.  If that changes, it is a significant ABI break, and we need
to fix that in the kernel.

And even if it wasn't, the fix is still wrong - we'd need to wait for
the node to appear using libudev APIs and/or one of the file
notification syscalls, rather than adding a probing loop that is in a
different place than the actual open.
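
To illustrate what waiting via a file notification syscall could look
like, here is a minimal sketch. It is written in Python with ctypes for
brevity, not in nvme-cli's C, and it is not taken from nvme-cli. The
helper name wait_for_node, the timeout value, and the choice of inotify
over libudev are all assumptions made for the example. It is Linux-only
and watches the parent directory for the node being created.

```python
import ctypes
import ctypes.util
import os
import select
import struct
import time

# inotify event masks from <sys/inotify.h>
IN_MOVED_TO = 0x00000080
IN_CREATE = 0x00000100

_libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6",
                    use_errno=True)


def wait_for_node(path, timeout=5.0):
    """Hypothetical helper: block until `path` exists or `timeout`
    seconds pass. Returns True if the node exists, False otherwise."""
    directory, name = os.path.split(os.path.abspath(path))
    # On Linux, IN_NONBLOCK and IN_CLOEXEC share the O_* flag values.
    fd = _libc.inotify_init1(os.O_NONBLOCK | os.O_CLOEXEC)
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init1 failed")
    try:
        # Install the watch first, then check, so a node created in
        # between the two steps cannot be missed.
        wd = _libc.inotify_add_watch(fd, os.fsencode(directory),
                                     IN_CREATE | IN_MOVED_TO)
        if wd < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
        if os.path.exists(path):
            return True

        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return os.path.exists(path)
            ready, _, _ = select.select([fd], [], [], remaining)
            if not ready:
                return os.path.exists(path)
            buf = os.read(fd, 4096)
            offset = 0
            while offset < len(buf):
                # struct inotify_event: int wd; u32 mask, cookie, len
                _wd, _mask, _cookie, length = struct.unpack_from(
                    "iIII", buf, offset)
                offset += 16
                entry = buf[offset:offset + length].split(b"\0", 1)[0]
                offset += length
                if entry == os.fsencode(name):
                    return True
    finally:
        os.close(fd)
```

The point of the sketch is that the caller sleeps until the kernel
reports the directory change, instead of re-probing on a timer. A real
implementation would more likely use libudev's monitor interface, which
also covers the case where udev is still applying rules to the node.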

And for retrying the actual I/O we need to decide on the exact semantics
we want to support first.  Blind n-times retry is always a bad idea;
we need to build some sort of reliable infrastructure, be that
optionally marking requests as not failfast and/or some sort of poll
notification for a device that is ready.