From mboxrd@z Thu Jan 1 00:00:00 1970
From: hare@suse.de (Hannes Reinecke)
Date: Fri, 14 Dec 2018 11:18:07 +0100
Subject: [PATCH 4/4] block: expose devt for GENHD_FL_HIDDEN disks
In-Reply-To:
References: <20181206164812.30925-1-cascardo@canonical.com>
 <20181206164812.30925-5-cascardo@canonical.com>
 <20181213143218.GA8723@lst.de>
 <20181213152532.GA5321@calabresa>
 <35acb1b3-77f5-29cf-b92d-5171f4ad6450@suse.de>
 <20181214085606.GD5321@calabresa>
 <20181214090656.GE5321@calabresa>
Message-ID: <2a74bee9-a0c6-c9dd-2a4a-1d4605bfd1cd@suse.de>

On 12/14/18 10:54 AM, Hannes Reinecke wrote:
> On 12/14/18 10:06 AM, Thadeu Lima de Souza Cascardo wrote:
>> On Fri, Dec 14, 2018 at 06:56:06AM -0200, Thadeu Lima de Souza Cascardo wrote:
>>> On Fri, Dec 14, 2018 at 08:47:20AM +0100, Hannes Reinecke wrote:
>>>> But you haven't answered my question:
>>>>
>>>> Why can't we patch 'lsblk' to provide the required information even with the
>>>> current sysfs layout?
>>>>
>>>
>>
>> Just to be clear here. If with 'current sysfs layout' you mean without any of
>> the patches we have been talking about, lsblk is not broken. It just works with
>> nvme multipath enabled. It will show the multipath paths and simply ignore the
>> underlying/hidden ones. If we hid them, we meant for them to be hidden, right?
>>
>> What I am trying to fix here is how to find out which PCI device/driver is
>> needed to get to the block device holding the root filesystem, which is what
>> initramfs needs. And the nvme multipath device is a virtual device, pointing to
>> no driver at all, and no relation to its underlying devices, needed for it to
>> work.
>>
>
> Well ...
> But this is an entirely different proposition.
> The 'slaves'/'holders' trick just allows to map the relationship between
> _block_ devices, which arguably is a bit pointless here seeing that we
> don't actually have block devices for the underlying devices.
> But even if we _were_ implementing that you would still fail to get to
> the PCI device providing the block devices as there is no link pointing
> from one to another.
>
> With the current layout we have this hierarchy:
>
> NVMe namespace (/dev/nvmeXn1Y) -> NVMe-subsys -> NVMe controller
>
> and the NVMe controller is missing a link pointing to the device
> presenting the controller:
>
> # ls -l /sys/devices/virtual/nvme-fabrics/ctl/nvme2
> total 0
> -r--r--r-- 1 root root 4096 Dec 13 13:18 address
> -r--r--r-- 1 root root 4096 Dec 13 13:18 cntlid
> --w------- 1 root root 4096 Dec 13 13:18 delete_controller
> -r--r--r-- 1 root root 4096 Dec 13 13:18 dev
> lrwxrwxrwx 1 root root    0 Dec 13 13:18 device -> ../../ctl
> -r--r--r-- 1 root root 4096 Dec 13 13:18 firmware_rev
> -r--r--r-- 1 root root 4096 Dec 13 13:18 model
> drwxr-xr-x 9 root root    0 Dec  3 13:55 nvme2c64n1
> drwxr-xr-x 2 root root    0 Dec 13 13:18 power
> --w------- 1 root root 4096 Dec 13 13:18 rescan_controller
> --w------- 1 root root 4096 Dec 13 13:18 reset_controller
> -r--r--r-- 1 root root 4096 Dec 13 13:18 serial
> -r--r--r-- 1 root root 4096 Dec 13 13:18 state
> -r--r--r-- 1 root root 4096 Dec 13 13:18 subsysnqn
> lrwxrwxrwx 1 root root    0 Dec  3 13:44 subsystem -> ../../../../../class/nvme
> -r--r--r-- 1 root root 4096 Dec 13 13:18 transport
> -rw-r--r-- 1 root root 4096 Dec 13 13:18 uevent
>
> So what we need to do is to update the 'device' link to point to the PCI
> device providing the controller.
> (Actually, we would need the 'device' link to point to the entity
> providing the transport address, but I guess we don't have that
> for now.)
>
> And _that's_ what we need to fix; the slaves/holders stuff doesn't solve
> the underlying problem, and really shouldn't be merged at all.
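
For illustration only, here is a minimal sketch of the lookup that a
correct 'device' link would allow, e.g. from an initramfs generator:
follow the controller's 'device' link and check whether the target sits
on the PCI bus. The helper names ctrl_parent and is_pci_backed are
hypothetical and not part of any existing tool, and 'readlink -f' is
assumed to be available (GNU coreutils or busybox).

```shell
#!/bin/sh
# Sketch only: ctrl_parent and is_pci_backed are hypothetical helper names.

# Print the resolved target of a controller's 'device' link,
# e.g. ctrl_parent /sys/class/nvme/nvme0
ctrl_parent() {
	# No 'device' link means there is nothing to follow.
	[ -L "$1/device" ] || return 1
	readlink -f "$1/device"
}

# Succeed if the device behind the controller is on the PCI bus,
# judged by the name its 'subsystem' link resolves to.
is_pci_backed() {
	parent=$(ctrl_parent "$1") || return 1
	bus=$(readlink -f "$parent/subsystem") || return 1
	[ "$(basename "$bus")" = "pci" ]
}
```

With such helpers, is_pci_backed would be expected to succeed for a PCI
controller like /sys/class/nvme/nvme0 and to fail for the fabrics
controller nvme2 listed above, whose 'device' link only reaches the
virtual 'ctl' node.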

Mind you, it _does_ work for PCI-NVMe:

# ls -l /sys/class/nvme/nvme0
total 0
-r--r--r--  1 root root 4096 Dec 14 11:14 cntlid
-r--r--r--  1 root root 4096 Dec 14 11:14 dev
lrwxrwxrwx  1 root root    0 Dec 14 11:14 device -> ../../../0000:45:00.0
-r--r--r--  1 root root 4096 Dec 14 11:14 firmware_rev
-r--r--r--  1 root root 4096 Dec 14 11:14 model
drwxr-xr-x 12 root root    0 Dec  3 13:43 nvme1n1
drwxr-xr-x  2 root root    0 Dec 14 11:14 power
--w-------  1 root root 4096 Dec 14 11:14 rescan_controller
--w-------  1 root root 4096 Dec 14 11:14 reset_controller
-r--r--r--  1 root root 4096 Dec 14 11:14 serial
-r--r--r--  1 root root 4096 Dec 14 11:14 state
-r--r--r--  1 root root 4096 Dec 14 11:14 subsysnqn
lrwxrwxrwx  1 root root    0 Dec  3 13:43 subsystem -> ../../../../../../class/nvme
-r--r--r--  1 root root 4096 Dec 14 11:14 transport
-rw-r--r--  1 root root 4096 Dec 14 11:14 uevent

So it might be as simple as this patch:

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index feb86b59170e..1ecdec6b8b4a 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3117,7 +3117,7 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts,
 	 * Defer this to the connect path.
 	 */
 
-	ret = nvme_init_ctrl(&ctrl->ctrl, dev, &nvme_fc_ctrl_ops, 0);
+	ret = nvme_init_ctrl(&ctrl->ctrl, ctrl->dev, &nvme_fc_ctrl_ops, 0);
 	if (ret)
 		goto out_cleanup_admin_q;

As for RDMA / TCP we're running on a network address which really isn't
tied to a specific device, so we wouldn't have any device to hook on
without some trickery.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke            Teamlead Storage & Networking
hare@suse.de                               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)