Device is ineligible for IOMMU domain attach due to platform RMRR requirement

public inbox for kvm@vger.kernel.org

* Device is ineligible for IOMMU domain attach due to platform RMRR requirement
@ 2015-03-06  5:20 Steven DuChene
  2015-03-06  6:10 ` Alex Williamson
  0 siblings, 1 reply; 6+ messages in thread
From: Steven DuChene @ 2015-03-06  5:20 UTC (permalink / raw)
  To: kvm

I am attempting on Ubuntu 14.04 to configure PCI passthrough of an NVidia
K40 GPU card that is plugged into an HP DL580 rack-mounted server.
I have done all of the pre-work I have normally done in the past with
pci-stub, vfio, etc., but when I try to execute a qemu-system-x86_64
command that works on a similar version of Debian, I get the following
error in dmesg:

Device is ineligible for IOMMU domain attach due to platform RMRR 
requirement. Contact your platform vendor.
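
A minimal sketch of how one might pull the affected devices out of a saved
dmesg capture. The message text is the one quoted above; the
"<driver> <PCI address>:" prefix on the line and the sample log lines are
assumptions for illustration, not output from the system in this report.

```python
import re

# Message text as quoted in this thread. The line prefix layout
# ("<driver> <PCI address>: ...") is an assumption about the log format.
RMRR_MSG = "Device is ineligible for IOMMU domain attach"
PCI_ADDR = re.compile(
    r"\b[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]\b"
)


def rmrr_blocked_devices(log_text):
    """Return the PCI address found on each line carrying the RMRR message.

    A line that has the message but no recognizable address yields None,
    so that no occurrence is silently dropped.
    """
    found = []
    for line in log_text.splitlines():
        if RMRR_MSG not in line:
            continue
        match = PCI_ADDR.search(line)
        found.append(match.group(0) if match else None)
    return found


# Hypothetical sample lines, not taken from the reporter's machine.
SAMPLE = (
    "[  101.000000] vfio-pci 0000:04:00.0: Device is ineligible for IOMMU "
    "domain attach due to platform RMRR requirement.  Contact your "
    "platform vendor.\n"
    "[  101.100000] some unrelated message\n"
)

if __name__ == "__main__":
    print(rmrr_blocked_devices(SAMPLE))
```

Feeding it the text of `dmesg` saved to a file gives a quick list of which
devices the kernel refused to attach.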

I have read through the patch description from Alex at:

http://lists.linuxfoundation.org/pipermail/iommu/2014-June/008816.html

and I have read the IOMMU documentation at:

https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt

but I still do not understand whether there is a fix for this and, if
so, what it is.

The Ubuntu 14.04 system where I am getting this error is running
3.16.0-30-generic.
The Debian system where I can do similar PCI passthrough of an NVidia K2
GPU device is running a 3.14.29-4 kernel.

Can anyone provide any insight into a fix or workaround for this?

Thanks in advance.
--
Steven DuChene


* Re: Device is ineligible for IOMMU domain attach due to platform RMRR requirement
  2015-03-06  5:20 Device is ineligible for IOMMU domain attach due to platform RMRR requirement Steven DuChene
@ 2015-03-06  6:10 ` Alex Williamson
  2015-03-07  3:10   ` Steven DuChene
  0 siblings, 1 reply; 6+ messages in thread
From: Alex Williamson @ 2015-03-06  6:10 UTC (permalink / raw)
  To: Steven DuChene; +Cc: kvm

On Fri, 2015-03-06 at 00:20 -0500, Steven DuChene wrote:
> I am attempting on Ubuntu 14.04 to configure PCI passthrough of an NVidia
> K40 GPU card that is plugged into an HP DL580 rack-mounted server.
> I have done all of the pre-work I have normally done in the past with
> pci-stub, vfio, etc., but when I try to execute a qemu-system-x86_64
> command that works on a similar version of Debian, I get the following
> error in dmesg:
> 
> Device is ineligible for IOMMU domain attach due to platform RMRR 
> requirement. Contact your platform vendor.
> 
> I have read through the patch description from Alex at:
> 
> http://lists.linuxfoundation.org/pipermail/iommu/2014-June/008816.html
> 
> and I have read the IOMMU documentation at:
> 
> https://www.kernel.org/doc/Documentation/Intel-IOMMU.txt
> 
> but I still do not understand whether there is a fix for this and, if
> so, what it is.
> 
> The Ubuntu 14.04 system where I am getting this error is running
> 3.16.0-30-generic.
> The Debian system where I can do similar PCI passthrough of an NVidia K2
> GPU device is running a 3.14.29-4 kernel.
> 
> Can anyone provide any insight into a fix or workaround for this?

Hi Steven,

The issue is that VT-d RMRRs are a platform imposed requirement that a
device continue to have identity mapped access to a platform defined
memory region at all times.  This requirement is fundamentally
incompatible with PCI device assignment where the address space of the
assigned device is defined by the VM.  The VT-d specification hints at
this restriction (8.4):

        The RMRR regions are expected to be used for legacy usages (such
        as USB, UMA Graphics, etc.) requiring reserved memory. Platform
        designers should avoid or limit use of reserved memory regions
        since these require system software to create holes in the DMA
        virtual address range available to system software and its
        drivers.

In order to support assignment of such devices and continue to honor the
RMRR, reserved memory regions would need to be imposed on the guest.
Doing this has a number of issues and it's not clear that it enables any
usable configurations due to the lack of isolation often implied by the
RMRRs.  RMRRs themselves imply some sort of communication conduit to the
platform, which it's also not clear should be allowed for a guest owned
device.

We also cannot continue the previous behavior of simply ignoring RMRRs
for assigned devices.  Not only does the platform require us to honor
them, failing to do so could have implications for the health and
integrity of both the platform and the VM.

As indicated by the dmesg warning, users encountering this problem
should contact their platform vendor, which is really the only course of
action that I can recommend.  Only the platform vendor can tell you why
they've imposed this requirement for the device and potentially offer a
remedy to remove that requirement.  KVM is not the first hypervisor to
impose this restriction for such devices.  The referenced patch was
tagged for stable, so you can expect that this change will eventually
trickle through all the distributions.  Sorry for the trouble, but it
really was a necessary change.  Thanks,

Alex


* Re: Device is ineligible for IOMMU domain attach due to platform RMRR requirement
  2015-03-06  6:10 ` Alex Williamson
@ 2015-03-07  3:10   ` Steven DuChene
  2015-03-07  4:43     ` Alex Williamson
  0 siblings, 1 reply; 6+ messages in thread
From: Steven DuChene @ 2015-03-07  3:10 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm

Alex:
Thanks for your quick reply and the information. One question though: 
When you say contact the platform vendor, are you talking about the 
vendor of the GPU card (NVidia) or the vendor of the system hardware 
(HP)? I.e., is the problem in the system BIOS/firmware or in the firmware
of the GPU card?

This seems like it is going to be the death knell of PCI passthrough,
as the likelihood of getting a system vendor to fix some obscure thing
like this seems remote.


* Re: Device is ineligible for IOMMU domain attach due to platform RMRR requirement
  2015-03-07  3:10   ` Steven DuChene
@ 2015-03-07  4:43     ` Alex Williamson
  2015-03-07 10:13       ` Steven DuChene
  0 siblings, 1 reply; 6+ messages in thread
From: Alex Williamson @ 2015-03-07  4:43 UTC (permalink / raw)
  To: Steven DuChene; +Cc: kvm

On Fri, 2015-03-06 at 22:10 -0500, Steven DuChene wrote:
> Alex:
> Thanks for your quick reply and the information. One question though: 
> When you say contact the platform vendor, are you talking about the 
> vendor of the GPU card (NVidia) or the vendor of the system hardware 
> (HP)? I.e., is the problem in the system BIOS/firmware or in the firmware
> of the GPU card?
> 
> This seems like it is going to be the death knell of PCI passthrough,
> as the likelihood of getting a system vendor to fix some obscure thing
> like this seems remote.

Hi Steven,

The problem is in the system firmware; the platform vendor in your case
is HP.  The issue is actually very limited.  Most platform vendors do
not make use of RMRRs beyond the recommendations of the VT-d spec.  This
limits RMRRs in the general case to a small set of devices that are not
generally used for PCI assignment anyway.  An exemption even exists for
RMRRs associated with USB devices since their usage is known to be
limited to early boot.  That effectively limits the scope for most
vendors to UMA graphics where PCI assignment does not yet work anyway.
I expect an exemption could also be added there once the RMRR usage is
discovered and documented.
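
The policy described above can be sketched roughly as follows. This is a
simplified illustration, not the kernel's implementation: the `PciDevice`
type, its fields, and the function names are hypothetical, and only the
USB exemption mentioned in this message is modeled. The class code value
0x0c03 is the PCI base class/subclass for USB controllers.

```python
from dataclasses import dataclass

# PCI base class 0x0c (serial bus controller), subclass 0x03 (USB).
PCI_CLASS_SERIAL_USB = 0x0C03


@dataclass
class PciDevice:
    """Hypothetical stand-in for what the host knows about a device."""
    address: str     # PCI address, e.g. "0000:04:00.0" (made up)
    class_code: int  # 24-bit class code: base class, subclass, prog-if
    has_rmrr: bool   # True if firmware lists the device in an RMRR


def is_usb(dev):
    # Drop the low programming-interface byte before comparing.
    return (dev.class_code >> 8) == PCI_CLASS_SERIAL_USB


def eligible_for_assignment(dev):
    """Simplified policy: an RMRR blocks assignment unless the device is USB."""
    if not dev.has_rmrr:
        return True
    return is_usb(dev)


if __name__ == "__main__":
    # Example devices with made-up addresses.
    gpu = PciDevice("0000:04:00.0", 0x030200, has_rmrr=True)   # 3D controller
    usb = PciDevice("0000:00:14.0", 0x0C0330, has_rmrr=True)   # xHCI
    nic = PciDevice("0000:05:00.0", 0x020000, has_rmrr=False)  # Ethernet
    for dev in (gpu, usb, nic):
        print(dev.address, eligible_for_assignment(dev))
```

Under this model a GPU with a firmware-declared RMRR is refused, while a
USB controller with an RMRR and any device without one remain assignable.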

In the case you've encountered, the RMRR usage is proprietary and we
cannot know the extent of ongoing usage.  We must therefore assume that
it is in use and that the RMRR requirement of the platform must be
honored.

Obviously our goal with this change is not to pick on any specific
vendor, but to restrict PCI assignment to cases where it can be
implemented safely, both for the platform and the VM.  RMRRs restrict
how the IOVA space for a device can be used; we cannot continue to
ignore that restriction, and supporting it in a PCI device assignment
model presents implementation issues.  HP engineers as well as the
upstream community
have been consulted on this change and agreed to the restriction.  As I
said, KVM is not the first hypervisor to implement this restriction and
PCI assignment continues to be a valuable feature on those hypervisors.
Even on affected systems, RMRRs typically only apply to physical PCI
devices.  The vast majority of PCI assignment applications are used with
networking devices where SR-IOV is far more prevalent and where SR-IOV
virtual functions are typically unencumbered by RMRRs.

I believe this change is in the best interest of PCI assignment users,
the scope of affected systems is not as widespread as it might seem from
your perspective, and workarounds are often available for the most
common use case in the form of SR-IOV VFs.  Unfortunately we don't have
SR-IOV for Nvidia Tesla cards, so again, all I can offer is to contact
the platform vendor to see if there's any chance of a firmware update
that might remove this restriction.  Thanks,

Alex


* Re: Device is ineligible for IOMMU domain attach due to platform RMRR requirement
  2015-03-07  4:43     ` Alex Williamson
@ 2015-03-07 10:13       ` Steven DuChene
  2015-03-07 14:41         ` Alex Williamson
  0 siblings, 1 reply; 6+ messages in thread
From: Steven DuChene @ 2015-03-07 10:13 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm

Alex:
What would be the result of running an earlier kernel without your RMRR
patch on a system known to have these RMRR issues? Would there possibly
be some instability when trying to do PCI passthrough of these same
NVidia devices?

We have a debian install on one of these same systems and it is running 
a 3.14.23-2 kernel and we are seeing some issues with PCI passthrough.
--
Steven DuChene


* Re: Device is ineligible for IOMMU domain attach due to platform RMRR requirement
  2015-03-07 10:13       ` Steven DuChene
@ 2015-03-07 14:41         ` Alex Williamson
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Williamson @ 2015-03-07 14:41 UTC (permalink / raw)
  To: Steven DuChene; +Cc: kvm

On Sat, 2015-03-07 at 05:13 -0500, Steven DuChene wrote:
> Alex:
> What would be the result of running an earlier kernel without your RMRR
> patch on a system known to have these RMRR issues? Would there possibly
> be some instability when trying to do PCI passthrough of these same
> NVidia devices?
> 
> We have a debian install on one of these same systems and it is running 
> a 3.14.23-2 kernel and we are seeing some issues with PCI passthrough.

The potential problems depend on whether the RMRR memory region is
actually used, whether the device or the platform depends on that ongoing
access, and the address space consumed by the RMRR.  In the case of a
VM, the fear is that the RMRR is used as a data reporting location by
the device, that use continues after the device is assigned to the VM,
and that the RMRR memory region overlaps guest RAM.  If all of those
conditions hold true, then we have a memory integrity issue in the VM.
On the platform side, whatever data the platform was depending on the
device to report is now disconnected from the specified data reporting
range and is lost.
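
The three conditions above can be written down as a small check. This is
only a sketch under stated assumptions: the `Range` type, the function
names, and every address below are hypothetical, and whether the region is
really in use is exactly what cannot be known from outside the platform,
so those two flags are plain inputs here. The idea is that the device
keeps issuing DMA to the RMRR addresses, and once it is assigned those
addresses are interpreted in the guest's address space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    base: int  # first byte address
    end: int   # last byte address, inclusive


def overlaps(a, b):
    """True if two inclusive address ranges share at least one byte."""
    return a.base <= b.end and b.base <= a.end


def integrity_risk(rmrr, guest_ram, region_in_use, use_continues):
    """All three conditions from the message must hold for a risk.

    rmrr           -- the reserved region declared by firmware
    guest_ram      -- guest RAM ranges in guest physical address space
    region_in_use  -- the device uses the region as a reporting location
    use_continues  -- that use goes on after the device is assigned
    """
    hits = any(overlaps(rmrr, ram) for ram in guest_ram)
    return region_in_use and use_continues and hits


if __name__ == "__main__":
    # Made-up numbers: a 12 KiB RMRR just below 3 GiB, and a guest with
    # RAM from 0 to 3 GiB plus a block above 4 GiB.
    rmrr = Range(0xBDFFD000, 0xBDFFFFFF)
    guest_ram = [Range(0x0, 0xBFFFFFFF), Range(0x100000000, 0x1FFFFFFFF)]
    print(integrity_risk(rmrr, guest_ram, True, True))
```

If any one of the inputs is false, for example a small guest whose RAM
ends below the RMRR, the check reports no risk to guest memory, although
the platform-side loss of reported data described above still applies.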

In the case of an Nvidia device, I'd speculate that the more likely
cause of issues for that kernel would be that the VT-d hardware may not
support snoop-control and the kvm-vfio interface that manages whether
KVM emulates cache coherence instructions wasn't added until kernel
v3.15.  Thanks,

Alex

