From: Mike Snitzer <snitzer@redhat.com>
To: Ming Lin <mlin@kernel.org>
Cc: device-mapper development <dm-devel@redhat.com>,
	linux-nvme@lists.infradead.org
Subject: Re: NVMeoF multi-path setup
Date: Thu, 30 Jun 2016 18:52:07 -0400
Message-ID: <20160630225207.GB22293@redhat.com>
In-Reply-To: <1467323858.15863.3.camel@ssi>

On Thu, Jun 30 2016 at  5:57pm -0400,
Ming Lin <mlin@kernel.org> wrote:

> On Thu, 2016-06-30 at 14:08 -0700, Ming Lin wrote:
> > Hi Mike,
> > 
> > I'm trying to test NVMeoF multi-path.
> > 
> > root@host:~# lsmod |grep dm_multipath
> > dm_multipath           24576  0
> > root@host:~# ps aux |grep multipath
> > root     13183  0.0  0.1 238452  4972 ?        SLl  13:41   0:00
> > /sbin/multipathd
> > 
> > I have nvme0 and nvme1 that are 2 paths to the same NVMe subsystem.
> > 
> > root@host:/sys/class/nvme# grep . nvme*/address
> > nvme0/address:traddr=192.168.3.2,trsvcid=1023
> > nvme1/address:traddr=192.168.2.2,trsvcid=1023
> > 
> > root@host:/sys/class/nvme# grep . nvme*/subsysnqn
> > nvme0/subsysnqn:nqn.testiqn
> > nvme1/subsysnqn:nqn.testiqn
> > 
> > root@host:~# /lib/udev/scsi_id --export --whitelisted -d /dev/nvme1n1
> > ID_SCSI=1
> > ID_VENDOR=NVMe
> > ID_VENDOR_ENC=NVMe\x20\x20\x20\x20
> > ID_MODEL=Linux
> > ID_MODEL_ENC=Linux
> > ID_REVISION=0-rc
> > ID_TYPE=disk
> > ID_SERIAL=SNVMe_Linux
> > ID_SERIAL_SHORT=
> > ID_SCSI_SERIAL=1122334455667788
> > 
> > root@host:~# /lib/udev/scsi_id --export --whitelisted -d /dev/nvme0n1
> > ID_SCSI=1
> > ID_VENDOR=NVMe
> > ID_VENDOR_ENC=NVMe\x20\x20\x20\x20
> > ID_MODEL=Linux
> > ID_MODEL_ENC=Linux
> > ID_REVISION=0-rc
> > ID_TYPE=disk
> > ID_SERIAL=SNVMe_Linux
> > ID_SERIAL_SHORT=
> > ID_SCSI_SERIAL=1122334455667788
> > 
> > But seems multipathd didn't recognize these 2 devices.
> > 
> > What else I'm missing?
> 
> There are two problems:
> 
> 1. there is no "/block/" in the path
> 
> /sys/devices/virtual/nvme-fabrics/block/nvme0/nvme0n1

You clarified that it is:
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0n1

Do you have CONFIG_BLK_DEV_NVME_SCSI enabled?

AFAIK, hch had Intel disable that by default in the hopes of avoiding
people having dm-multipath "just work" with NVMeoF.  (Makes me wonder
what other unpleasant unilateral decisions were made because some
non-existent NVMe-specific multipath capabilities would be forthcoming,
but I digress).

My understanding is that enabling CONFIG_BLK_DEV_NVME_SCSI will cause
NVMe to respond favorably to standard SCSI VPD inquiries.

And _yes_, Red Hat will be enabling it so users have options!

Also, just so you're aware, I've staged bio-based dm-multipath support
for the 4.8 merge window.  Please see either the 'for-next' or 'dm-4.8'
branch in linux-dm.git:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=for-next
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.8

I'd welcome your testing whether bio-based dm-multipath performs better
for you than blk-mq request-based dm-multipath.  Both modes (using the
4.8 staged code) can be easily selected per DM multipath device by
adding either of these to its table: queue_mode=bio or queue_mode=mq

(made possible with this commit:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.8&id=e83068a5faafb8ca65d3b58bd1e1e3959ce1ddce
)
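
For illustration, here is a minimal sketch of what such a table line
could look like when a multipath device is built by hand for testing.
It rests on several assumptions that are not confirmed in this message:
that the feature is written in the table as two words
("queue_mode <mode>", counted as two feature arguments), that the usual
dm-multipath table layout with a single round-robin path group applies,
and that the two paths are the 259:0 and 259:1 devices shown in
/proc/partitions (937692504 1KiB blocks, i.e. 1875385008 sectors).  The
helper name build_mpath_table() is hypothetical.  Check the syntax
against the commit linked above before relying on it.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a one-group, two-path dm-multipath table line.
 * Assumed layout (not taken from the original message):
 *   <start> <len> multipath <#feature args> queue_mode <mode>
 *   <#hw handler args> <#path groups> <initial group>
 *   <selector> <#selector args> <#paths> <#per-path args>
 *   <dev> <repeat count> <dev> <repeat count>
 * Returns 0 on success, -1 on an unsupported mode or a short buffer.
 */
int build_mpath_table(char *buf, size_t len, unsigned long long sectors,
                      const char *mode, const char *path_a,
                      const char *path_b)
{
    /* Only the two modes named in the message are accepted here. */
    if (strcmp(mode, "bio") != 0 && strcmp(mode, "mq") != 0)
        return -1;

    int n = snprintf(buf, len,
                     "0 %llu multipath 2 queue_mode %s 0 1 1 "
                     "round-robin 0 2 1 %s 1 %s 1",
                     sectors, mode, path_a, path_b);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string could then be passed to
"dmsetup create <name> --table '<line>'", once with "bio" and once with
"mq", to compare the two modes on the same pair of paths.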

> 2. nvme was blacklisted.
> 
> I added below quick hack to just make it work.
> 
> root@host:~# cat /proc/partitions
> 
>  259        0  937692504 nvme0n1
>  252        0  937692504 dm-0
>  259        1  937692504 nvme1n1
> 
> diff --git a/libmultipath/blacklist.c b/libmultipath/blacklist.c
> index 2400eda..a143383 100644
> --- a/libmultipath/blacklist.c
> +++ b/libmultipath/blacklist.c
> @@ -190,9 +190,11 @@ setup_default_blist (struct config * conf)
>  	if (store_ble(conf->blist_devnode, str, ORIGIN_DEFAULT))
>  		return 1;
>  
> +#if 0
>  	str = STRDUP("^nvme.*");
>  	if (!str)
>  		return 1;
> +#endif
>  	if (store_ble(conf->blist_devnode, str, ORIGIN_DEFAULT))
>  		return 1;

That's weird; I'm not sure why that'd be the case.  Maybe it's because
NVMeoF hasn't been worked through to "just work" with multipath-tools
yet.  Ben? Hannes?
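
To make the effect of that default entry concrete, here is a small
sketch of how a devnode pattern such as "^nvme.*" ends up matching
every NVMe namespace node.  It uses the POSIX regex API with extended
syntax.  Whether libmultipath compiles its blacklist entries with
exactly these flags is an assumption, and devnode_matches() is a
hypothetical helper, not a multipath-tools function.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Return 1 if devnode matches pattern, 0 if it does not,
 * and -1 if the pattern fails to compile.
 */
int devnode_matches(const char *pattern, const char *devnode)
{
    regex_t re;

    /* REG_NOSUB: only a yes/no answer is needed, no capture offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    int hit = (regexec(&re, devnode, 0, NULL, 0) == 0);

    regfree(&re);
    return hit;
}
```

With the pattern from the diff, both "nvme0n1" and "nvme1n1" match,
while "sda" and "dm-0" do not, so both paths from the report above are
filtered out before any multipath map can be assembled.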

> diff --git a/multipathd/main.c b/multipathd/main.c
> index c0ca571..1364070 100644
> --- a/multipathd/main.c
> +++ b/multipathd/main.c
> @@ -1012,6 +1012,7 @@ uxsock_trigger (char * str, char ** reply, int * len, void * trigger_data)
>  static int
>  uev_discard(char * devpath)
>  {
> +#if 0
>  	char *tmp;
>  	char a[11], b[11];
>  
> @@ -1028,6 +1029,7 @@ uev_discard(char * devpath)
>  		condlog(4, "discard event on %s", devpath);
>  		return 1;
>  	}
> +#endif
>  	return 0;
>  }

Why did you have to comment out this discard code?
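
For reference while answering, here is a sketch of the kind of check
the disabled block appears to perform.  The diff hunks above elide the
middle of uev_discard(), so this is a reconstruction under assumptions:
that the function looks for a "/block/" component in the uevent devpath
and drops anything without one, and that it also drops entries that sit
below a disk (partitions).  The name discard_devpath() is hypothetical
and the logging is left out.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 to discard the uevent, 0 to keep it. */
int discard_devpath(const char *devpath)
{
    char a[11], b[11];
    const char *tmp = strstr(devpath, "/block/");

    /* No "/block/" component at all: not treated as a block device. */
    if (tmp == NULL)
        return 1;

    /* Nothing usable after "/block/". */
    if (sscanf(tmp, "/block/%10s", a) != 1)
        return 1;

    /* Two components after "/block/" means a child of a disk. */
    if (sscanf(tmp, "/block/%10[^/]/%10s", a, b) == 2)
        return 1;

    return 0;
}
```

Under these assumptions the devpath
/sys/devices/virtual/nvme-fabrics/ctl/nvme0/nvme0n1 is discarded
because it has no "/block/" component, which would be consistent with
problem 1 as described above.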
