* Re: [PATCH] PCI: Add Intel remapped NVMe device support [not found] ` <CAD8Lp47Vu=w+Lj77_vL05JYV1WMog9WX3FHGE+TseFrhcLoTuA@mail.gmail.com> @ 2019-06-18 7:46 ` Hannes Reinecke 2019-06-18 8:06 ` Daniel Drake [not found] ` <CAOSXXT4Ba_6xRUyaQBxpq+zdG9_itXDhFJ5EFZPv3CQuJZKHzg@mail.gmail.com> 1 sibling, 1 reply; 5+ messages in thread From: Hannes Reinecke @ 2019-06-18 7:46 UTC (permalink / raw) To: Daniel Drake, Christoph Hellwig Cc: Jens Axboe, Sagi Grimberg, Linux PCI, Keith Busch, Keith Busch, linux-ide, linux-nvme, Bjorn Helgaas, Linux Upstreaming Team On 6/14/19 4:26 AM, Daniel Drake wrote: > On Thu, Jun 13, 2019 at 4:54 PM Christoph Hellwig <hch@lst.de> wrote: >> So until we get very clear and good documentation from Intel on that >> I don't think any form of upstream support will fly. And given that >> Dan who submitted the original patch can't even talk about this thing >> any more and apparently got a gag order doesn't really give me confidence >> any of this will ever work. > > I realise the architecture here seems badly thought out, and the lack > of a decent spec makes the situation worse, but I'd encourage you to > reconsider this from the perspectives of: > - Are the patches really more ugly than the underlying architecture? > - We strive to make Linux work well on common platforms and sometimes > have to accept that hardware vendors do questionable things & do not > fully cooperate > - It works out of the box on Windows > Actually, there _is_ a register description: https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf Look for section 15: Intel RST for PCIe Storage. That gives you a reasonable description of the various registers etc. You'll have to translate from Intel-speak, but that should be manageable. In general, I am _quite_ in favour of having Linux Support for these kind of devices. I fully agree the interface is ugly, and badly thought out. But these devices exist and have been sold for quite some time now, so there is no way we can influence the design on those boxes. And we really have come a long way from the the original Linux idea of "hey, that's weird hardware, let's hack together a driver for it" to the flat rejection of hardware we don't like. I, for one, prefer the old way. Looking at the spec I think that Daniels approach of exposing an additional PCI device is the correct way of going about it (as the interface design can be thought of a messed-up SR-IOV interface), but I'll defer to Bjorn here. But we really should have a driver to get these boxes rolling. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] PCI: Add Intel remapped NVMe device support 2019-06-18 7:46 ` [PATCH] PCI: Add Intel remapped NVMe device support Hannes Reinecke @ 2019-06-18 8:06 ` Daniel Drake 2019-06-18 15:15 ` Hannes Reinecke 0 siblings, 1 reply; 5+ messages in thread From: Daniel Drake @ 2019-06-18 8:06 UTC (permalink / raw) To: Hannes Reinecke Cc: Christoph Hellwig, Jens Axboe, Sagi Grimberg, Linux PCI, Keith Busch, Keith Busch, linux-ide, linux-nvme, Bjorn Helgaas, Linux Upstreaming Team On Tue, Jun 18, 2019 at 3:46 PM Hannes Reinecke <hare@suse.de> wrote: > On 6/14/19 4:26 AM, Daniel Drake wrote: > > On Thu, Jun 13, 2019 at 4:54 PM Christoph Hellwig <hch@lst.de> wrote: > >> So until we get very clear and good documentation from Intel on that > >> I don't think any form of upstream support will fly. And given that > >> Dan who submitted the original patch can't even talk about this thing > >> any more and apparently got a gag order doesn't really give me confidence > >> any of this will ever work. > > > > I realise the architecture here seems badly thought out, and the lack > > of a decent spec makes the situation worse, but I'd encourage you to > > reconsider this from the perspectives of: > > - Are the patches really more ugly than the underlying architecture? > > - We strive to make Linux work well on common platforms and sometimes > > have to accept that hardware vendors do questionable things & do not > > fully cooperate > > - It works out of the box on Windows > > > Actually, there _is_ a register description: > > https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf > > Look for section 15: Intel RST for PCIe Storage. > > That gives you a reasonable description of the various registers etc. Thanks for your email! I also spotted it for the first time earlier today. Section 15 there (D24:F0) describes a special/hidden PCI device which I can't figure out how to access from Linux (I believe it would be at D18:F0 in the cases where the 300 PCH is integrated into the SoC, as it is on the Whiskey Lake platform I have here). That's probably not important because if even if we had access all the values are probably read-only, as the BIOS will lock them all down during early boot, as is documented. But the docs give some interesting insights into the design. Section 15.2 is potentially more relevant, as it describes registers within the AHCI BAR and we do have access to that. Some of these registers are already used by the current code to determine the presence of remapped devices. It might be nice to use Device Memory BAR Length (DMBL_1) but I can't figure out what is meant by "A 1 in the bit location indicates the corresponding lower memory BAR bit for the PCIe SSD device is a Read/Write (RW) bit." The value is 0x3fff on the platform I have here. We can probably also use these registers for MSI support. I started to experiment, doesn't quite work but I'll keep poking. The doc suggests there is a single MSI-X vector for the AHCI SATA device, and AHCI MSI-X Starting Vector (AMXV) has value 0x140 on this platform. No idea how to interpret that value. From experimentation, the AHCI SATA disk generates interrupts on vector 0. Then there are multiple vectors for the remapped NVMe devices. Device MSI-X Configuration (DMXC_L_1) is set up to assign vectors 1 to 19 to NVMe on this platform. But it says "This field is only valid when DMXC.ID indicates interrupt delivery using MSI-X" but what/where is DMXC.ID? So far I can get NVMe-related interrupts on vector 1 but apparently not enough of them, the driver hangs during probe. I've nearly finished refreshing & extending Dan Williams' patches and will send them for more discussion soon. Daniel ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] PCI: Add Intel remapped NVMe device support 2019-06-18 8:06 ` Daniel Drake @ 2019-06-18 15:15 ` Hannes Reinecke 2019-06-19 13:52 ` Bjorn Helgaas 0 siblings, 1 reply; 5+ messages in thread From: Hannes Reinecke @ 2019-06-18 15:15 UTC (permalink / raw) To: Daniel Drake Cc: Jens Axboe, Sagi Grimberg, Linux PCI, Linux Upstreaming Team, Keith Busch, linux-ide, linux-nvme, Keith Busch, Bjorn Helgaas, Christoph Hellwig On 6/18/19 10:06 AM, Daniel Drake wrote: > On Tue, Jun 18, 2019 at 3:46 PM Hannes Reinecke <hare@suse.de> wrote: >> On 6/14/19 4:26 AM, Daniel Drake wrote: >>> On Thu, Jun 13, 2019 at 4:54 PM Christoph Hellwig <hch@lst.de> wrote: >>>> So until we get very clear and good documentation from Intel on that >>>> I don't think any form of upstream support will fly. And given that >>>> Dan who submitted the original patch can't even talk about this thing >>>> any more and apparently got a gag order doesn't really give me confidence >>>> any of this will ever work. >>> >>> I realise the architecture here seems badly thought out, and the lack >>> of a decent spec makes the situation worse, but I'd encourage you to >>> reconsider this from the perspectives of: >>> - Are the patches really more ugly than the underlying architecture? >>> - We strive to make Linux work well on common platforms and sometimes >>> have to accept that hardware vendors do questionable things & do not >>> fully cooperate >>> - It works out of the box on Windows >>> >> Actually, there _is_ a register description: >> >> https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf >> >> Look for section 15: Intel RST for PCIe Storage. >> >> That gives you a reasonable description of the various registers etc. > > Thanks for your email! I also spotted it for the first time earlier today. > > Section 15 there (D24:F0) describes a special/hidden PCI device which > I can't figure out how to access from Linux (I believe it would be at > D18:F0 in the cases where the 300 PCH is integrated into the SoC, as > it is on the Whiskey Lake platform I have here). That's probably not > important because if even if we had access all the values are probably > read-only, as the BIOS will lock them all down during early boot, as > is documented. But the docs give some interesting insights into the > design. > > Section 15.2 is potentially more relevant, as it describes registers > within the AHCI BAR and we do have access to that. Some of these > registers are already used by the current code to determine the > presence of remapped devices. It might be nice to use Device Memory > BAR Length (DMBL_1) but I can't figure out what is meant by "A 1 in > the bit location indicates the corresponding lower memory BAR bit for > the PCIe SSD device is a Read/Write (RW) bit." The value is 0x3fff on > the platform I have here. > > We can probably also use these registers for MSI support. I started to > experiment, doesn't quite work but I'll keep poking. The doc suggests > there is a single MSI-X vector for the AHCI SATA device, and AHCI > MSI-X Starting Vector (AMXV) has value 0x140 on this platform. No idea > how to interpret that value. From experimentation, the AHCI SATA disk > generates interrupts on vector 0. > The 0x140 is probably the offset into the PCI config space where the AHCI MSI-X vector table can be found ... > Then there are multiple vectors for the remapped NVMe devices. Device > MSI-X Configuration (DMXC_L_1) is set up to assign vectors 1 to 19 to > NVMe on this platform. But it says "This field is only valid when > DMXC.ID indicates interrupt delivery using MSI-X" but what/where is > DMXC.ID? So far I can get NVMe-related interrupts on vector 1 but > apparently not enough of them, the driver hangs during probe. > I _think_ the problem here is the automatic interrupt affinity we're doing; we probably have to exclude the AHCI vector when setting up interrupt affinity. > I've nearly finished refreshing & extending Dan Williams' patches and > will send them for more discussion soon. > THX. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] PCI: Add Intel remapped NVMe device support 2019-06-18 15:15 ` Hannes Reinecke @ 2019-06-19 13:52 ` Bjorn Helgaas 0 siblings, 0 replies; 5+ messages in thread From: Bjorn Helgaas @ 2019-06-19 13:52 UTC (permalink / raw) To: Hannes Reinecke Cc: Daniel Drake, Jens Axboe, Sagi Grimberg, Linux PCI, Keith Busch, linux-ide, linux-nvme, Keith Busch, Linux Upstreaming Team, Christoph Hellwig On Tue, Jun 18, 2019 at 05:15:52PM +0200, Hannes Reinecke wrote: > On 6/18/19 10:06 AM, Daniel Drake wrote: > > We can probably also use these registers for MSI support. I > > started to experiment, doesn't quite work but I'll keep poking. > > The doc suggests there is a single MSI-X vector for the AHCI SATA > > device, and AHCI MSI-X Starting Vector (AMXV) has value 0x140 on > > this platform. No idea how to interpret that value. From > > experimentation, the AHCI SATA disk generates interrupts on vector > > 0. > > > The 0x140 is probably the offset into the PCI config space where the > AHCI MSI-X vector table can be found ... An MSI-X vector table is in memory space, not config space. You'd have to look at PCI_MSIX_TABLE_BIR to find which BAR maps it, and then add PCI_MSIX_TABLE_OFFSET to the BAR value. Bjorn ^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <CAOSXXT4Ba_6xRUyaQBxpq+zdG9_itXDhFJ5EFZPv3CQuJZKHzg@mail.gmail.com>]
[parent not found: <20190614200557.GS13533@google.com>]
[parent not found: <CAOSXXT7eV4SBSkoKoOKAPaUQxczrD3rAvpz=12LTQenvRjCYRw@mail.gmail.com>]
* Re: [PATCH] PCI: Add Intel remapped NVMe device support [not found] ` <CAOSXXT7eV4SBSkoKoOKAPaUQxczrD3rAvpz=12LTQenvRjCYRw@mail.gmail.com> @ 2019-06-18 7:48 ` Hannes Reinecke 0 siblings, 0 replies; 5+ messages in thread From: Hannes Reinecke @ 2019-06-18 7:48 UTC (permalink / raw) To: Keith Busch, Bjorn Helgaas Cc: Jens Axboe, Sagi Grimberg, Linux PCI, linux-nvme, Daniel Drake, linux-ide, Keith Busch, Linux Upstreaming Team, Christoph Hellwig On 6/14/19 11:05 PM, Keith Busch wrote: > On Fri, Jun 14, 2019 at 2:06 PM Bjorn Helgaas <helgaas@kernel.org> wrote: >> Ugh. Are you saying the installation instructions for Linux will say >> "change the BIOS setting to AHCI"? That's an unpleasant user >> experience, especially if the installation fails if the user hasn't >> read the instructions. > > :) That is essentially what the dmesg currently says when Linux > detects this, from drivers/ata/ahci.c:ahci_remap_check() > > "Switch your BIOS from RAID to AHCI mode to use them (remapped NVMe devices)" > > Unpleasant yes, but what's the lesser of two evils? Recommend the > supported full featured mode, or try to support with degraded > capabilities? > ... and we already had several customer calls because of this. Quite tedious, and completely unneccesary IMO. The time for having the interface changed has passed, and we have to live with it. I still do prefer to have a driver for it, just to smooth the initial user experience. Cheers, Hannes -- Dr. Hannes Reinecke Teamlead Storage & Networking hare@suse.de +49 911 74053 688 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-06-19 13:52 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20190610074456.2761-1-drake@endlessm.com>
[not found] ` <CAOSXXT7OFzHeTxNqZ1sS6giRxhDcrUUnVjURWBiFUc5T_8p=MA@mail.gmail.com>
[not found] ` <CAD8Lp45djPU_Ur8uCO2Y5Sbek_5N9QKkxLXdKNVcvkr6rFPLUQ@mail.gmail.com>
[not found] ` <CAOSXXT7H6HxY-za66Tr9ybRQyHsTdwwAgk9O2F=xK42MT8HsuA@mail.gmail.com>
[not found] ` <20190613085402.GC13442@lst.de>
[not found] ` <CAD8Lp47Vu=w+Lj77_vL05JYV1WMog9WX3FHGE+TseFrhcLoTuA@mail.gmail.com>
2019-06-18 7:46 ` [PATCH] PCI: Add Intel remapped NVMe device support Hannes Reinecke
2019-06-18 8:06 ` Daniel Drake
2019-06-18 15:15 ` Hannes Reinecke
2019-06-19 13:52 ` Bjorn Helgaas
[not found] ` <CAOSXXT4Ba_6xRUyaQBxpq+zdG9_itXDhFJ5EFZPv3CQuJZKHzg@mail.gmail.com>
[not found] ` <20190614200557.GS13533@google.com>
[not found] ` <CAOSXXT7eV4SBSkoKoOKAPaUQxczrD3rAvpz=12LTQenvRjCYRw@mail.gmail.com>
2019-06-18 7:48 ` Hannes Reinecke
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox