public inbox for linux-ide@vger.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] PCI: Add Intel remapped NVMe device support
       [not found]         ` <CAD8Lp47Vu=w+Lj77_vL05JYV1WMog9WX3FHGE+TseFrhcLoTuA@mail.gmail.com>
@ 2019-06-18  7:46           ` Hannes Reinecke
  2019-06-18  8:06             ` Daniel Drake
       [not found]           ` <CAOSXXT4Ba_6xRUyaQBxpq+zdG9_itXDhFJ5EFZPv3CQuJZKHzg@mail.gmail.com>
  1 sibling, 1 reply; 5+ messages in thread
From: Hannes Reinecke @ 2019-06-18  7:46 UTC (permalink / raw)
  To: Daniel Drake, Christoph Hellwig
  Cc: Jens Axboe, Sagi Grimberg, Linux PCI, Keith Busch, Keith Busch,
	linux-ide, linux-nvme, Bjorn Helgaas, Linux Upstreaming Team

On 6/14/19 4:26 AM, Daniel Drake wrote:
> On Thu, Jun 13, 2019 at 4:54 PM Christoph Hellwig <hch@lst.de> wrote:
>> So until we get very clear and good documentation from Intel on that
>> I don't think any form of upstream support will fly.  And given that
>> Dan who submitted the original patch can't even talk about this thing
>> any more and apparently got a gag order doesn't really give me confidence
>> any of this will ever work.
> 
> I realise the architecture here seems badly thought out, and the lack
> of a decent spec makes the situation worse, but I'd encourage you to
> reconsider this from the perspectives of:
>  - Are the patches really more ugly than the underlying architecture?
>  - We strive to make Linux work well on common platforms and sometimes
> have to accept that hardware vendors do questionable things & do not
> fully cooperate
>  - It works out of the box on Windows
> 
Actually, there _is_ a register description:

https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf

Look for section 15: Intel RST for PCIe Storage.

That gives you a reasonable description of the various registers etc.
You'll have to translate from Intel-speak, but that should be manageable.

In general, I am _quite_ in favour of having Linux Support for these
kind of devices.
I fully agree the interface is ugly, and badly thought out.
But these devices exist and have been sold for quite some time now, so
there is no way we can influence the design on those boxes.

And we really have come a long way from the the original Linux idea of
"hey, that's weird hardware, let's hack together a driver for it" to the
flat rejection of hardware we don't like.
I, for one, prefer the old way.

Looking at the spec I think that Daniels approach of exposing an
additional PCI device is the correct way of going about it (as the
interface design can be thought of a messed-up SR-IOV interface), but
I'll defer to Bjorn here.
But we really should have a driver to get these boxes rolling.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: Add Intel remapped NVMe device support
       [not found]               ` <CAOSXXT7eV4SBSkoKoOKAPaUQxczrD3rAvpz=12LTQenvRjCYRw@mail.gmail.com>
@ 2019-06-18  7:48                 ` Hannes Reinecke
  0 siblings, 0 replies; 5+ messages in thread
From: Hannes Reinecke @ 2019-06-18  7:48 UTC (permalink / raw)
  To: Keith Busch, Bjorn Helgaas
  Cc: Jens Axboe, Sagi Grimberg, Linux PCI, linux-nvme, Daniel Drake,
	linux-ide, Keith Busch, Linux Upstreaming Team, Christoph Hellwig

On 6/14/19 11:05 PM, Keith Busch wrote:
> On Fri, Jun 14, 2019 at 2:06 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>> Ugh.  Are you saying the installation instructions for Linux will say
>> "change the BIOS setting to AHCI"?  That's an unpleasant user
>> experience, especially if the installation fails if the user hasn't
>> read the instructions.
> 
> :) That is essentially what the dmesg currently says when Linux
> detects this, from drivers/ata/ahci.c:ahci_remap_check()
> 
>   "Switch your BIOS from RAID to AHCI mode to use them (remapped NVMe devices)"
> 
> Unpleasant yes, but what's the lesser of two evils? Recommend the
> supported full featured mode, or try to support with degraded
> capabilities?
> 
... and we already had several customer calls because of this.
Quite tedious, and completely unneccesary IMO.
The time for having the interface changed has passed, and we have to
live with it.
I still do prefer to have a driver for it, just to smooth the initial
user experience.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: Add Intel remapped NVMe device support
  2019-06-18  7:46           ` [PATCH] PCI: Add Intel remapped NVMe device support Hannes Reinecke
@ 2019-06-18  8:06             ` Daniel Drake
  2019-06-18 15:15               ` Hannes Reinecke
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Drake @ 2019-06-18  8:06 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Christoph Hellwig, Jens Axboe, Sagi Grimberg, Linux PCI,
	Keith Busch, Keith Busch, linux-ide, linux-nvme, Bjorn Helgaas,
	Linux Upstreaming Team

On Tue, Jun 18, 2019 at 3:46 PM Hannes Reinecke <hare@suse.de> wrote:
> On 6/14/19 4:26 AM, Daniel Drake wrote:
> > On Thu, Jun 13, 2019 at 4:54 PM Christoph Hellwig <hch@lst.de> wrote:
> >> So until we get very clear and good documentation from Intel on that
> >> I don't think any form of upstream support will fly.  And given that
> >> Dan who submitted the original patch can't even talk about this thing
> >> any more and apparently got a gag order doesn't really give me confidence
> >> any of this will ever work.
> >
> > I realise the architecture here seems badly thought out, and the lack
> > of a decent spec makes the situation worse, but I'd encourage you to
> > reconsider this from the perspectives of:
> >  - Are the patches really more ugly than the underlying architecture?
> >  - We strive to make Linux work well on common platforms and sometimes
> > have to accept that hardware vendors do questionable things & do not
> > fully cooperate
> >  - It works out of the box on Windows
> >
> Actually, there _is_ a register description:
>
> https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf
>
> Look for section 15: Intel RST for PCIe Storage.
>
> That gives you a reasonable description of the various registers etc.

Thanks for your email! I also spotted it for the first time earlier today.

Section 15 there (D24:F0) describes a special/hidden PCI device which
I can't figure out how to access from Linux (I believe it would be at
D18:F0 in the cases where the 300 PCH is integrated into the SoC, as
it is on the Whiskey Lake platform I have here). That's probably not
important because if even if we had access all the values are probably
read-only, as the BIOS will lock them all down during early boot, as
is documented. But the docs give some interesting insights into the
design.

Section 15.2 is potentially more relevant, as it describes registers
within the AHCI BAR and we do have access to that. Some of these
registers are already used by the current code to determine the
presence of remapped devices. It might be nice to use Device Memory
BAR Length (DMBL_1) but I can't figure out what is meant by "A 1 in
the bit location indicates the corresponding lower memory BAR bit for
the PCIe SSD device is a Read/Write (RW) bit." The value is 0x3fff on
the platform I have here.

We can probably also use these registers for MSI support. I started to
experiment, doesn't quite work but I'll keep poking. The doc suggests
there is a single MSI-X vector for the AHCI SATA device, and AHCI
MSI-X Starting Vector (AMXV) has value 0x140 on this platform. No idea
how to interpret that value. From experimentation, the AHCI SATA disk
generates interrupts on vector 0.

Then there are multiple vectors for the remapped NVMe devices. Device
MSI-X Configuration (DMXC_L_1) is set up to assign vectors 1 to 19 to
NVMe on this platform. But it says "This field is only valid when
DMXC.ID indicates interrupt delivery using MSI-X" but what/where is
DMXC.ID? So far I can get NVMe-related interrupts on vector 1 but
apparently not enough of them, the driver hangs during probe.

I've nearly finished refreshing & extending Dan Williams' patches and
will send them for more discussion soon.

Daniel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: Add Intel remapped NVMe device support
  2019-06-18  8:06             ` Daniel Drake
@ 2019-06-18 15:15               ` Hannes Reinecke
  2019-06-19 13:52                 ` Bjorn Helgaas
  0 siblings, 1 reply; 5+ messages in thread
From: Hannes Reinecke @ 2019-06-18 15:15 UTC (permalink / raw)
  To: Daniel Drake
  Cc: Jens Axboe, Sagi Grimberg, Linux PCI, Linux Upstreaming Team,
	Keith Busch, linux-ide, linux-nvme, Keith Busch, Bjorn Helgaas,
	Christoph Hellwig

On 6/18/19 10:06 AM, Daniel Drake wrote:
> On Tue, Jun 18, 2019 at 3:46 PM Hannes Reinecke <hare@suse.de> wrote:
>> On 6/14/19 4:26 AM, Daniel Drake wrote:
>>> On Thu, Jun 13, 2019 at 4:54 PM Christoph Hellwig <hch@lst.de> wrote:
>>>> So until we get very clear and good documentation from Intel on that
>>>> I don't think any form of upstream support will fly.  And given that
>>>> Dan who submitted the original patch can't even talk about this thing
>>>> any more and apparently got a gag order doesn't really give me confidence
>>>> any of this will ever work.
>>>
>>> I realise the architecture here seems badly thought out, and the lack
>>> of a decent spec makes the situation worse, but I'd encourage you to
>>> reconsider this from the perspectives of:
>>>  - Are the patches really more ugly than the underlying architecture?
>>>  - We strive to make Linux work well on common platforms and sometimes
>>> have to accept that hardware vendors do questionable things & do not
>>> fully cooperate
>>>  - It works out of the box on Windows
>>>
>> Actually, there _is_ a register description:
>>
>> https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-2.pdf
>>
>> Look for section 15: Intel RST for PCIe Storage.
>>
>> That gives you a reasonable description of the various registers etc.
> 
> Thanks for your email! I also spotted it for the first time earlier today.
> 
> Section 15 there (D24:F0) describes a special/hidden PCI device which
> I can't figure out how to access from Linux (I believe it would be at
> D18:F0 in the cases where the 300 PCH is integrated into the SoC, as
> it is on the Whiskey Lake platform I have here). That's probably not
> important because if even if we had access all the values are probably
> read-only, as the BIOS will lock them all down during early boot, as
> is documented. But the docs give some interesting insights into the
> design.
> 
> Section 15.2 is potentially more relevant, as it describes registers
> within the AHCI BAR and we do have access to that. Some of these
> registers are already used by the current code to determine the
> presence of remapped devices. It might be nice to use Device Memory
> BAR Length (DMBL_1) but I can't figure out what is meant by "A 1 in
> the bit location indicates the corresponding lower memory BAR bit for
> the PCIe SSD device is a Read/Write (RW) bit." The value is 0x3fff on
> the platform I have here.
> 
> We can probably also use these registers for MSI support. I started to
> experiment, doesn't quite work but I'll keep poking. The doc suggests
> there is a single MSI-X vector for the AHCI SATA device, and AHCI
> MSI-X Starting Vector (AMXV) has value 0x140 on this platform. No idea
> how to interpret that value. From experimentation, the AHCI SATA disk
> generates interrupts on vector 0.
> 
The 0x140 is probably the offset into the PCI config space where the
AHCI MSI-X vector table can be found ...

> Then there are multiple vectors for the remapped NVMe devices. Device
> MSI-X Configuration (DMXC_L_1) is set up to assign vectors 1 to 19 to
> NVMe on this platform. But it says "This field is only valid when
> DMXC.ID indicates interrupt delivery using MSI-X" but what/where is
> DMXC.ID? So far I can get NVMe-related interrupts on vector 1 but
> apparently not enough of them, the driver hangs during probe.
> 
I _think_ the problem here is the automatic interrupt affinity we're
doing; we probably have to exclude the AHCI vector when setting up
interrupt affinity.

> I've nearly finished refreshing & extending Dan Williams' patches and
> will send them for more discussion soon.
> 
THX.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		   Teamlead Storage & Networking
hare@suse.de			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] PCI: Add Intel remapped NVMe device support
  2019-06-18 15:15               ` Hannes Reinecke
@ 2019-06-19 13:52                 ` Bjorn Helgaas
  0 siblings, 0 replies; 5+ messages in thread
From: Bjorn Helgaas @ 2019-06-19 13:52 UTC (permalink / raw)
  To: Hannes Reinecke
  Cc: Daniel Drake, Jens Axboe, Sagi Grimberg, Linux PCI, Keith Busch,
	linux-ide, linux-nvme, Keith Busch, Linux Upstreaming Team,
	Christoph Hellwig

On Tue, Jun 18, 2019 at 05:15:52PM +0200, Hannes Reinecke wrote:
> On 6/18/19 10:06 AM, Daniel Drake wrote:

> > We can probably also use these registers for MSI support. I
> > started to experiment, doesn't quite work but I'll keep poking.
> > The doc suggests there is a single MSI-X vector for the AHCI SATA
> > device, and AHCI MSI-X Starting Vector (AMXV) has value 0x140 on
> > this platform. No idea how to interpret that value. From
> > experimentation, the AHCI SATA disk generates interrupts on vector
> > 0.
> > 
> The 0x140 is probably the offset into the PCI config space where the
> AHCI MSI-X vector table can be found ...

An MSI-X vector table is in memory space, not config space.  You'd
have to look at PCI_MSIX_TABLE_BIR to find which BAR maps it, and then
add PCI_MSIX_TABLE_OFFSET to the BAR value.

Bjorn

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-06-19 13:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20190610074456.2761-1-drake@endlessm.com>
     [not found] ` <CAOSXXT7OFzHeTxNqZ1sS6giRxhDcrUUnVjURWBiFUc5T_8p=MA@mail.gmail.com>
     [not found]   ` <CAD8Lp45djPU_Ur8uCO2Y5Sbek_5N9QKkxLXdKNVcvkr6rFPLUQ@mail.gmail.com>
     [not found]     ` <CAOSXXT7H6HxY-za66Tr9ybRQyHsTdwwAgk9O2F=xK42MT8HsuA@mail.gmail.com>
     [not found]       ` <20190613085402.GC13442@lst.de>
     [not found]         ` <CAD8Lp47Vu=w+Lj77_vL05JYV1WMog9WX3FHGE+TseFrhcLoTuA@mail.gmail.com>
2019-06-18  7:46           ` [PATCH] PCI: Add Intel remapped NVMe device support Hannes Reinecke
2019-06-18  8:06             ` Daniel Drake
2019-06-18 15:15               ` Hannes Reinecke
2019-06-19 13:52                 ` Bjorn Helgaas
     [not found]           ` <CAOSXXT4Ba_6xRUyaQBxpq+zdG9_itXDhFJ5EFZPv3CQuJZKHzg@mail.gmail.com>
     [not found]             ` <20190614200557.GS13533@google.com>
     [not found]               ` <CAOSXXT7eV4SBSkoKoOKAPaUQxczrD3rAvpz=12LTQenvRjCYRw@mail.gmail.com>
2019-06-18  7:48                 ` Hannes Reinecke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox