From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>,
amd-gfx list <amd-gfx@lists.freedesktop.org>,
dri-devel <dri-devel@lists.freedesktop.org>,
Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
"Deucher, Alexander" <Alexander.Deucher@amd.com>
Subject: Re: amdgpu didn't start with pci=nocrs parameter, get error "Fatal error during GPU init"
Date: Fri, 24 Feb 2023 08:13:52 +0100
Message-ID: <a99e6def-68be-3f2b-4e01-ac26cdb80f49@gmail.com>
In-Reply-To: <CABXGCsMbqw2qzWSCDfp3cNrYVJ1oxLv8Aixfm_Dt91x1cvFX4w@mail.gmail.com>
Hi Mikhail,
this is pretty clearly a problem with the system and/or its BIOS and
not the GPU hw or the driver.
The option pci=nocrs makes the kernel ignore additional resource windows
the BIOS reports through ACPI. This then most likely leads to problems
with amdgpu because it can't bring up its PCIe resources any more.
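
To confirm whether a given boot actually ran with this option, the kernel command line can be inspected. The following is only a minimal sketch: the function name has_pci_nocrs is made up for illustration, and it assumes that pci= takes a comma-separated option list, so that nocrs may appear next to other options.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tool): succeed if the
# kernel command line given as $1 contains pci=nocrs, either alone or
# inside a comma-separated pci= option list such as pci=realloc,nocrs.
has_pci_nocrs() {
    # Put each boot argument on its own line, then match only a pci=
    # argument whose option list contains exactly "nocrs".
    printf '%s\n' "$1" | tr ' ' '\n' |
        grep -Eq '^pci=([^,]+,)*nocrs(,.*)?$'
}

# Usage on a running Linux system:
#   if has_pci_nocrs "$(cat /proc/cmdline)"; then
#       echo "booted with pci=nocrs"
#   fi
```

Comparing the "root bus resource" lines in dmesg between a boot with and without the option should then show which windows the kernel no longer takes from the BIOS.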
The output of "sudo lspci -vvvv -s $BUSID_OF_AMDGPU" might help
understand the problem, but I strongly suggest trying a BIOS update first.
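
One possible way to fill in $BUSID_OF_AMDGPU is to take the PCI address from the amdgpu lines in the kernel log. This is a rough sketch only: the function name amdgpu_busid is invented here, and it assumes the log lines have the form "amdgpu 0000:03:00.0: ..." as in the quoted log below.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the first
# PCI address (domain:bus:device.function) found on an "amdgpu" log line.
amdgpu_busid() {
    grep -o 'amdgpu [0-9a-f]\{4\}:[0-9a-f]\{2\}:[0-9a-f]\{2\}\.[0-7]' |
        head -n 1 |
        cut -d ' ' -f 2
}

# Usage on a live system (reading the kernel log usually needs root):
#   BUSID_OF_AMDGPU=$(sudo dmesg | amdgpu_busid)
#   sudo lspci -vvvv -s "$BUSID_OF_AMDGPU"
```

On a system with more than one AMD GPU this picks only the first address that shows up in the log, so the result should be checked against plain "lspci" output.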
Regards,
Christian.
On 24.02.23 at 00:40, Mikhail Gavrilov wrote:
> Hi,
> I have a laptop ASUS ROG Strix G15 Advantage Edition G513QY-HQ007. But
> it is impossible to use without AC power because the system loses the
> nvme when I disconnect the power adapter.
>
> Messages from kernel log when it happens:
> nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
> nvme nvme0: Does your device have a faulty power saving mode enabled?
> nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
>
> I tried to use the recommended parameters
> (nvme_core.default_ps_max_latency_us=0 and pcie_aspm=off) to resolve
> this issue, but without success.
>
> On the linux-nvme mailing list the last advice was to try the
> "pci=nocrs" parameter.
>
> But with this parameter the amdgpu driver refuses to work and makes
> the system unbootable. I can solve the problem with the booting system
> by blacklisting the driver but it is not a good solution, because I
> don't wanna lose the GPU.
>
> Why does amdgpu not work with "pci=nocrs"?
> And is it possible to solve this incompatibility?
> It is very important because when I boot the system without the amdgpu
> driver and with "pci=nocrs", the nvme is not lost when I disconnect the
> power adapter. So "pci=nocrs" really helps.
>
> Below is what I see in the kernel log when I add the "pci=nocrs" parameter:
>
> amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from ATRM
> amdgpu: ATOM BIOS: SWBRT77321.001
> [drm] VCN(0) decode is enabled in VM mode
> [drm] VCN(0) encode is enabled in VM mode
> [drm] JPEG decode is enabled in VM mode
> Console: switching to colour dummy device 80x25
> amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
> [drm] GPU posting now...
> [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
> amdgpu 0000:03:00.0: amdgpu: VRAM: 12272M 0x0000008000000000 - 0x00000082FEFFFFFF (12272M used)
> amdgpu 0000:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
> amdgpu 0000:03:00.0: amdgpu: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
> [drm] Detected VRAM RAM=12272M, BAR=16384M
> [drm] RAM width 192bits GDDR6
> [drm] amdgpu: 12272M of VRAM memory ready
> [drm] amdgpu: 31774M of GTT memory ready.
> amdgpu 0000:03:00.0: amdgpu: (-14) failed to allocate kernel bo
> [drm] Debug VRAM access will use slowpath MM access
> amdgpu 0000:03:00.0: amdgpu: Failed to DMA MAP the dummy page
> [drm:amdgpu_device_init [amdgpu]] *ERROR* sw_init of IP block <gmc_v10_0> failed -12
> amdgpu 0000:03:00.0: amdgpu: amdgpu_device_ip_init failed
> amdgpu 0000:03:00.0: amdgpu: Fatal error during GPU init
> amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device.
>
> Of course a full system log is also attached.
>