From: Hannes Reinecke <hare@suse.de>
To: Christoph Hellwig <hch@lst.de>
Cc: Keith Busch <keith.busch@wdc.com>,
	Sagi Grimberg <sagi@grimberg.me>,
	linux-nvme@lists.infradead.org
Subject: Re: [PATCH 2/2] nvme: add 'queue_if_no_path' semantics
Date: Tue, 6 Oct 2020 15:45:01 +0200
Message-ID: <ce2c93e1-ba38-cebb-33b3-d506116a61aa@suse.de>
In-Reply-To: <00e75643-d422-ca12-1648-02ca89044182@suse.de>

On 10/6/20 3:30 PM, Hannes Reinecke wrote:
> On 10/6/20 10:39 AM, Christoph Hellwig wrote:
>> On Tue, Oct 06, 2020 at 10:29:49AM +0200, Hannes Reinecke wrote:
>>>> All multipath devices should behave the same.  No special casing for
>>>> PCIe, please.
>>>>
>>> Even if the default behaviour breaks PCI hotplug?
>>
>> Why would it "break" PCI hotplug?
>>
> When running under MD RAID:
> Before hotplug:
> # nvme list
> Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
> ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
> /dev/nvme0n1     SLESNVME1            QEMU NVMe Ctrl                           1          17.18  GB /  17.18  GB    512   B +  0 B   1.0
> /dev/nvme1n1     SLESNVME2            QEMU NVMe Ctrl                           1           4.29  GB /   4.29  GB    512   B +  0 B   1.0
> /dev/nvme2n1     SLESNVME3            QEMU NVMe Ctrl                           1           4.29  GB /   4.29  GB    512   B +  0 B   1.0
> After hotplug:
> 
> # nvme list
> Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
> ---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
> /dev/nvme0n1     SLESNVME1            QEMU NVMe Ctrl                           1          17.18  GB /  17.18  GB    512   B +  0 B   1.0
> /dev/nvme1n1     SLESNVME2            QEMU NVMe Ctrl                           -1          0.00   B /   0.00   B      1   B +  0 B   1.0
> /dev/nvme1n2     SLESNVME2            QEMU NVMe Ctrl                           1           4.29  GB /   4.29  GB    512   B +  0 B   1.0
> /dev/nvme2n1     SLESNVME3            QEMU NVMe Ctrl                           1           4.29  GB /   4.29  GB    512   B +  0 B   1.0
> 
> And MD hasn't been notified that the device is gone:
> # cat /proc/mdstat
> Personalities : [raid10]
> md0 : active raid10 nvme2n1[1] nvme1n1[0]
>        4189184 blocks super 1.2 2 near-copies [2/2] [UU]
>        bitmap: 0/1 pages [0KB], 65536KB chunk
> 
> unused devices: <none>
> 
> Once I do some I/O to it, MD recognizes a faulty device:
> 
> # cat /proc/mdstat
> Personalities : [raid10]
> md0 : active raid10 nvme2n1[1] nvme1n1[0](F)
>        4189184 blocks super 1.2 2 near-copies [2/1] [_U]
>        bitmap: 0/1 pages [0KB], 65536KB chunk
> 
> unused devices: <none>
> 
> but the re-added device isn't added to the MD RAID.
> In fact, it has been assigned a _different_ device node (nvme1n2 rather
> than nvme1n1), even though the namespace ID is still 1:
> 
> [  904.299065] pcieport 0000:00:08.0: pciehp: Slot(0-1): Card present
> [  904.299067] pcieport 0000:00:08.0: pciehp: Slot(0-1): Link Up
> [  904.435314] pci 0000:02:00.0: [8086:5845] type 00 class 0x010802
> [  904.435523] pci 0000:02:00.0: reg 0x10: [mem 0x00000000-0x00001fff 64bit]
> [  904.435676] pci 0000:02:00.0: reg 0x20: [mem 0x00000000-0x00000fff]
> [  904.436982] pci 0000:02:00.0: BAR 0: assigned [mem 0xc1200000-0xc1201fff 64bit]
> [  904.437086] pci 0000:02:00.0: BAR 4: assigned [mem 0xc1202000-0xc1202fff]
> [  904.437118] pcieport 0000:00:08.0: PCI bridge to [bus 02]
> [  904.437137] pcieport 0000:00:08.0:   bridge window [io  0x7000-0x7fff]
> [  904.439024] pcieport 0000:00:08.0:   bridge window [mem 0xc1200000-0xc13fffff]
> [  904.440229] pcieport 0000:00:08.0:   bridge window [mem 0x802000000-0x803ffffff 64bit pref]
> [  904.447150] nvme nvme3: pci function 0000:02:00.0
> [  904.447487] nvme 0000:02:00.0: enabling device (0000 -> 0002)
> [  904.458880] nvme nvme3: 1/0/0 default/read/poll queues
> [  904.461296] nvme1n2: detected capacity change from 0 to 4294967296
> 
> and the 'old', pre-hotplug device still lingers on in the 'nvme list' output.
> 
Compare that to a 'standard' NVMe device without the CMIC multipath
capability set, where with the same setup MD detaches the device on its
own:

# cat /proc/mdstat
Personalities : [raid10]
md127 : active (auto-read-only) raid10 nvme2n1[1]
       4189184 blocks super 1.2 2 near-copies [2/1] [_U]
       bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>
# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     SLESNVME1            QEMU NVMe Ctrl                           1          17.18  GB /  17.18  GB    512   B +  0 B   1.0
/dev/nvme2n1     SLESNVME3            QEMU NVMe Ctrl                           1           4.29  GB /   4.29  GB    512   B +  0 B   1.0

And yes, this is exactly the same setup, the only difference being the 
CMIC setting for the NVMe device.
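
For anyone wanting to reproduce the comparison, the /proc/mdstat
snapshots above can be checked mechanically. What follows is only a
minimal sketch, assuming the mdstat layout shown in this mail (an
"mdX : ..." line listing the members, followed by a status line with
"[n/m] [UU]"). The names parse_mdstat() and is_degraded() are
hypothetical helpers made up for this illustration; they are not part
of mdadm, nvme-cli or the kernel.

```python
import re

# A member looks like "nvme1n1[0]" and may carry flags such as "(F)".
MEMBER_RE = re.compile(r"(\w+)\[\d+\]((?:\([A-Z]\))*)")
# Status looks like "[2/1] [_U]": expected devices / devices in use.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
ARRAY_RE = re.compile(r"(md\w+) : ")


def parse_mdstat(text):
    """Parse /proc/mdstat-style text into a dict keyed by array name."""
    arrays = {}
    current = None
    for line in text.splitlines():
        head = ARRAY_RE.match(line)
        if head:
            current = {"members": [], "failed": [],
                       "expected": None, "working": None}
            arrays[head.group(1)] = current
            for name, flags in MEMBER_RE.findall(line[head.end():]):
                current["members"].append(name)
                if "(F)" in flags:
                    current["failed"].append(name)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            current["expected"] = int(status.group(1))
            current["working"] = int(status.group(2))
    return arrays


def is_degraded(info):
    """True if a member is flagged faulty or fewer devices are in use."""
    if info["failed"]:
        return True
    if info["expected"] is None or info["working"] is None:
        return False
    return info["working"] < info["expected"]


# Snapshot taken from this mail, after I/O was issued to the array.
AFTER_IO = """\
Personalities : [raid10]
md0 : active raid10 nvme2n1[1] nvme1n1[0](F)
       4189184 blocks super 1.2 2 near-copies [2/1] [_U]
       bitmap: 0/1 pages [0KB], 65536KB chunk

unused devices: <none>
"""

if __name__ == "__main__":
    for name, info in parse_mdstat(AFTER_IO).items():
        print(name, "degraded" if is_degraded(info) else "ok", info["failed"])
```

Note that the first post-hotplug snapshot ([2/2] [UU], no (F) flag)
would pass such a check, since MD only marks the member faulty once
I/O has been issued to it.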

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

