From: James Turner <linuxkernel.foss@dmarc-none.turner.link>
To: Alex Deucher <alexdeucher@gmail.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>,
	regressions@lists.linux.dev, kvm@vger.kernel.org,
	"Greg KH" <gregkh@linuxfoundation.org>,
	"Lijo Lazar" <lijo.lazar@amd.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Thorsten Leemhuis" <regressions@leemhuis.info>,
	"Alex Deucher" <alexander.deucher@amd.com>,
	"Christian König" <christian.koenig@amd.com>
Subject: Re: [REGRESSION] Too-low frequency limit for AMD GPU PCI-passed-through to Windows VM
Date: Fri, 21 Jan 2022 19:51:11 -0500
Message-ID: <87czkk1pmt.fsf@dmarc-none.turner.link>
In-Reply-To: <CADnq5_Nr5-FR2zP1ViVsD_ZMiW=UHC1wO8_HEGm26K_EG2KDoA@mail.gmail.com>

> Are you ever loading the amdgpu driver in your tests?

Yes, although I'm binding the `vfio-pci` driver to the AMD GPU's PCI
devices via the kernel command line. (See my initial email.) My
understanding is that `vfio-pci` is supposed to keep other drivers, such
as `amdgpu`, from interacting with the GPU, although that's clearly not
what's happening.

I've been testing with `amdgpu` included in the `MODULES` list in
`/etc/mkinitcpio.conf` (which Arch Linux uses to generate the
initramfs). However, I ran some more tests today (results below), this
time without `i915` or `amdgpu` in the `MODULES` list. The `amdgpu`
kernel module still gets loaded. (I think udev loads it automatically?)

Your comment gave me the idea to blacklist the `amdgpu` kernel module.
That does serve as a workaround on my machine – it fixes the behavior
for f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)")
and for the current Arch Linux prebuilt kernel (5.16.2-arch1-1). That's
an acceptable workaround for my machine only because the separate GPU
used by the host is an Intel integrated GPU. That workaround wouldn't
work well for someone with two AMD GPUs.


# New test results

The following tests are set up the same way as in my initial email,
with the following exceptions:

- I've updated libvirt to 1:8.0.0-1.

- I've removed `i915` and `amdgpu` from the `MODULES` list in
  `/etc/mkinitcpio.conf`.

For all three of these tests, `lspci` said the following:

% lspci -nnk -d 1002:6981
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Lexa XT [Radeon PRO WX 3200] [1002:6981]
	Subsystem: Dell Device [1028:0926]
	Kernel driver in use: vfio-pci
	Kernel modules: amdgpu

% lspci -nnk -d 1002:aae0
01:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Baffin HDMI/DP Audio [Radeon RX 550 640SP / RX 560/560X] [1002:aae0]
	Subsystem: Dell Device [1028:0926]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel
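
A minimal sketch for scripting this check, assuming `lspci -nnk` output in
the format shown above. The helper name `driver_in_use` is hypothetical and
is not part of any tool:

```shell
# Read `lspci -nnk` output on stdin and print the driver bound to each
# device, i.e. the value of every "Kernel driver in use:" line.
# Devices with no bound driver produce no output.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}
```

With the setup above, `lspci -nnk -d 1002:6981 | driver_in_use` should print
`vfio-pci`. Note that this reports only the bound driver. It says nothing
about which modules are loaded, which is why `amdgpu` can still show up in
`lsmod`.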


## Version f1688bd69ec4 ("drm/amd/amdgpu:save psp ring wptr to avoid attack")

This is the commit immediately preceding the one which introduced the issue.

% sudo dmesg | grep -i amdgpu
[   15.840160] [drm] amdgpu kernel modesetting enabled.
[   15.840884] amdgpu: CRAT table not found
[   15.840885] amdgpu: Virtual CRAT table created for CPU
[   15.840893] amdgpu: Topology: Add CPU node

% lsmod | grep amdgpu
amdgpu               7450624  0
gpu_sched              49152  1 amdgpu
drm_ttm_helper         16384  1 amdgpu
ttm                    77824  2 amdgpu,drm_ttm_helper
i2c_algo_bit           16384  2 amdgpu,i915
drm_kms_helper        303104  2 amdgpu,i915
drm                   581632  11 gpu_sched,drm_kms_helper,amdgpu,drm_ttm_helper,i915,ttm

The passed-through GPU worked properly in the VM.


## Version f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)")

This is the commit which introduced the issue.

% sudo dmesg | grep -i amdgpu
[   15.319023] [drm] amdgpu kernel modesetting enabled.
[   15.329468] amdgpu: CRAT table not found
[   15.329470] amdgpu: Virtual CRAT table created for CPU
[   15.329482] amdgpu: Topology: Add CPU node

% lsmod | grep amdgpu
amdgpu               7450624  0
gpu_sched              49152  1 amdgpu
drm_ttm_helper         16384  1 amdgpu
ttm                    77824  2 amdgpu,drm_ttm_helper
i2c_algo_bit           16384  2 amdgpu,i915
drm_kms_helper        303104  2 amdgpu,i915
drm                   581632  11 gpu_sched,drm_kms_helper,amdgpu,drm_ttm_helper,i915,ttm

The passed-through GPU did not run above 501 MHz in the VM.


## Blacklisted `amdgpu`, version f9b7f3703ff9 ("drm/amdgpu/acpi: make ATPX/ATCS structures global (v2)")

For this test, I added `module_blacklist=amdgpu` to the kernel command
line to blacklist the `amdgpu` module.

% sudo dmesg | grep -i amdgpu
[   14.591576] Module amdgpu is blacklisted

% lsmod | grep amdgpu

The passed-through GPU worked properly in the VM.
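
To confirm the blacklist from a script rather than by reading `dmesg`, here
is a minimal sketch. It assumes the module is blacklisted through the
`module_blacklist=` kernel parameter (a comma-separated list) and does not
look at modprobe configuration files. The helper name
`is_module_blacklisted` is hypothetical:

```shell
# Succeed if module $1 is listed in a module_blacklist= entry.
# $2 is the kernel command line to inspect; if it is not given,
# /proc/cmdline is read instead.
is_module_blacklisted() {
    module=$1
    cmdline=${2-$(cat /proc/cmdline)}
    # One parameter per line, keep only the blacklist values,
    # then one module per line and look for an exact match.
    printf '%s\n' "$cmdline" | tr ' ' '\n' |
        sed -n 's/^module_blacklist=//p' | tr ',' '\n' |
        grep -xF -- "$module" >/dev/null
}
```

For example, `is_module_blacklisted amdgpu && echo blacklisted` checks the
running kernel's command line.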


James

