Linux IOMMU Development
* Scalable mode and kdump kernels
@ 2022-07-02 15:49 Jerry Snitselaar
  2022-07-03  4:56 ` Baolu Lu
  2022-08-08  3:56 ` Baolu Lu
  0 siblings, 2 replies; 3+ messages in thread
From: Jerry Snitselaar @ 2022-07-02 15:49 UTC (permalink / raw)
  To: iommu, Joerg Roedel, Lu Baolu

There appears to be an issue on Sapphire Rapids when both of the following conditions hold:

- the kernel defaults to passthrough DMA domains, either by configuration or on the command line
- scalable mode is enabled

If you force a system crash, the kdump kernel hits a number of DMAR faults
while booting, so it never mounts the disk or harvests the vmcore. The
fault reason is 0x39, which means the present bit is clear in the scalable
mode root entry.
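
To make the fault reason above easier to spot in a kdump boot log, here is a
minimal sketch that pulls the reason code out of a log line and maps it to a
description. Only the two codes mentioned in this thread (0x39 and 0x59) are
in the table. The function name, the regular expression, and the sample line
are illustrative assumptions, not real kernel output.

```python
import re
from typing import Optional, Tuple

# Fault reasons discussed in this thread; the table is deliberately not exhaustive.
SM_FAULT_REASONS = {
    0x39: "present bit clear in scalable mode root entry",
    0x59: "present bit clear in scalable mode PASID table entry",
}

# Assumption: the log line carries the code as "fault reason 0x<hex>".
_REASON_RE = re.compile(r"fault reason (0x[0-9a-fA-F]+)")


def decode_fault_reason(line: str) -> Optional[Tuple[int, str]]:
    """Return (code, description) for a log line, or None if no code is found."""
    match = _REASON_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return code, SM_FAULT_REASONS.get(code, "not covered by this sketch")


# Hypothetical log line, written for illustration only.
sample = "DMAR: example request, fault reason 0x39"
print(decode_fault_reason(sample))
```

Codes outside the table are reported as not covered rather than guessed at.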

Looking at the translation table copying code, what is there currently
appears to be based on the older 2.5 VT-d spec and the extended
root/context entry formats, and it doesn't handle the scalable mode
formats. The PASID enable bit has moved from bit 11 to bit 3 in the
scalable mode context entry, and the translation table mode field in the
root table address register is different as well. Poking around at it last
night, I found that adding code to deal with those just shifts the problem
to the present bit in the scalable mode PASID table entry (fault reason
0x59), because the PASID tables aren't copied.
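
The bit move described above can be sketched as a single check on the low
64 bits of a context entry. The bit positions (11 for the extended format,
3 for scalable mode) are the ones given in this message; the function and
constant names are hypothetical, and this is not the kernel's code.

```python
# Bit positions as stated in this thread.
EXT_CONTEXT_PASIDE_BIT = 11  # extended (ECS) context entry
SM_CONTEXT_PASIDE_BIT = 3    # scalable mode context entry


def pasid_enabled(context_lo: int, scalable_mode: bool) -> bool:
    """Check the PASID enable bit in the low 64 bits of a context entry."""
    bit = SM_CONTEXT_PASIDE_BIT if scalable_mode else EXT_CONTEXT_PASIDE_BIT
    return bool((context_lo >> bit) & 1)


# A scalable mode entry with only bit 3 set looks PASID-disabled
# when it is read with the extended-format bit position.
entry_lo = 1 << 3
print(pasid_enabled(entry_lo, scalable_mode=True))
print(pasid_enabled(entry_lo, scalable_mode=False))
```

Reading a scalable mode entry with the old bit position silently gives the
wrong answer, so any copy code has to know which format it is looking at.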

Beyond the translation table copy code needing to work with scalable
mode formats, should scalable mode force the default DMA domain type
to switch to one of the translation modes if the default is passthrough?


Regards,
Jerry



* Re: Scalable mode and kdump kernels
  2022-07-02 15:49 Scalable mode and kdump kernels Jerry Snitselaar
@ 2022-07-03  4:56 ` Baolu Lu
  2022-08-08  3:56 ` Baolu Lu
  1 sibling, 0 replies; 3+ messages in thread
From: Baolu Lu @ 2022-07-03  4:56 UTC (permalink / raw)
  To: Jerry Snitselaar, iommu, Joerg Roedel; +Cc: baolu.lu

On 2022/7/2 23:49, Jerry Snitselaar wrote:
> Apparently there is an issue on Sapphire Rapids if you have the following conditions:
> 
> - kernel configured to default to passthrough dma domains, or set it on the command line
> - scalable mode enabled
> 
> If you force a system crash, there will be a number of DMAR faults
> when booting the kdump kernel, with the result being it doesn't mount
> the disk and harvest the vmcore.  The fault reason is 0x39, which means
> the present bit is clear in the scalable mode root entry.
> 
> Looking at the translation table copying code, it looks like what is
> there currently is based on the older 2.5 vt-d spec and the extended
> root/context entry formats, and doesn't handle the scalable mode

Yes. It was written for ECS (Extended Context Support) mode, which was
never implemented in hardware that reached the market.

> formats. The pasid enabled bit has moved in the scalable mode context
> entry from bit 11 to bit 3, and the translation type mode in the root
> table address register is different as well. Poking around at it last
> night, adding code to deal with those just shifts the problem to
> the present bit in the scalable mode pasid table entry (fault reason
> 0x59) which isn't copied.

In scalable mode the translation mode bits are in the PASID table
entries, so the PASID tables must be copied as well for this to work.
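
As a toy model of why copying only the upper-level tables moves the fault
instead of removing it, the sketch below reports the fault reason for the
first level that is missing from the kdump kernel's tables. It is a
deliberate simplification: it models only the root entry and the PASID table
entry, using the two fault reasons quoted in this thread, and skips the
levels in between. All names are hypothetical.

```python
from typing import Optional, Set

# Fault reasons quoted in this thread.
ROOT_ENTRY_NOT_PRESENT = 0x39
PASID_ENTRY_NOT_PRESENT = 0x59


def first_fault(copied_levels: Set[str]) -> Optional[int]:
    """Return the fault reason for the first uncopied level, or None if none is missing."""
    if "root" not in copied_levels:
        return ROOT_ENTRY_NOT_PRESENT
    if "pasid_table" not in copied_levels:
        return PASID_ENTRY_NOT_PRESENT
    return None


# Nothing copied, then root only, then root plus PASID tables.
for levels in (set(), {"root"}, {"root", "pasid_table"}):
    fault = first_fault(levels)
    print(sorted(levels), hex(fault) if fault is not None else "no fault")
```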

> 
> Beyond the translation table copy bits needing to work with scalable
> mode formats, should scalable mode force the default dma domain type
> to switch to one of the translation modes if the default is passthrough?

Agreed. The table copying code needs to be extended to support
scalable mode.

Best regards,
baolu


* Re: Scalable mode and kdump kernels
  2022-07-02 15:49 Scalable mode and kdump kernels Jerry Snitselaar
  2022-07-03  4:56 ` Baolu Lu
@ 2022-08-08  3:56 ` Baolu Lu
  1 sibling, 0 replies; 3+ messages in thread
From: Baolu Lu @ 2022-08-08  3:56 UTC (permalink / raw)
  To: Jerry Snitselaar, iommu, Joerg Roedel; +Cc: baolu.lu

Hi Jerry,

On 2022/7/2 23:49, Jerry Snitselaar wrote:
> Apparently there is an issue on Sapphire Rapids if you have the following conditions:
> 
> - kernel configured to default to passthrough dma domains, or set it on the command line
> - scalable mode enabled
> 
> If you force a system crash, there will be a number of DMAR faults
> when booting the kdump kernel, with the result being it doesn't mount
> the disk and harvest the vmcore.  The fault reason is 0x39, which means
> the present bit is clear in the scalable mode root entry.
> 
> Looking at the translation table copying code, it looks like what is
> there currently is based on the older 2.5 vt-d spec and the extended
> root/context entry formats, and doesn't handle the scalable mode
> formats. The pasid enabled bit has moved in the scalable mode context
> entry from bit 11 to bit 3, and the translation type mode in the root
> table address register is different as well. Poking around at it last
> night, adding code to deal with those just shifts the problem to
> the present bit in the scalable mode pasid table entry (fault reason
> 0x59) which isn't copied.
> 
> Beyond the translation table copy bits needing to work with scalable
> mode formats, should scalable mode force the default dma domain type
> to switch to one of the translation modes if the default is passthrough?

I have posted a patch that fixes this issue. Could you please give it a try?

https://lore.kernel.org/linux-iommu/20220808034612.1691470-1-baolu.lu@linux.intel.com/

Best regards,
baolu
