public inbox for linux-kernel@vger.kernel.org
From: "Christian König" <ckoenig.leichtzumerken@gmail.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: amd-gfx list <amd-gfx@lists.freedesktop.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>
Subject: Re: amdgpu didn't start with pci=nocrs parameter, get error "Fatal error during GPU init"
Date: Tue, 28 Feb 2023 13:43:01 +0100
Message-ID: <5cbba992-c4ce-01c1-2691-ed65ce66aad5@gmail.com>
In-Reply-To: <CABXGCsOJkF=c4B+oQm7cuEO7Fr_oknmH2iB6e6OCzmFy=KYtAw@mail.gmail.com>

On 28.02.23 at 10:52, Mikhail Gavrilov wrote:
> On Mon, Feb 27, 2023 at 3:22 PM Christian König
>> Unfortunately yes. We could clean that up a bit more so that you don't
>> run into a BUG() assertion, but what essentially happens here is that we
>> completely fail to talk to the hardware.
>>
>> In this situation we can't even re-enable vesa or text console any more.
>>
> Then I don't understand why, when amdgpu is blacklisted via
> modprobe.blacklist=amdgpu, I see graphics and can log in to
> GNOME. Yes, without hardware acceleration, but that is better than
> non-working graphics. It means there is some other driver (I assume this
> is "video") which can successfully talk to the AMD hardware in
> conditions where amdgpu cannot.

The point is that it doesn't need to talk to the amdgpu hardware. What it
does is talk to the good old VGA/VESA emulation, which just happens to
still be enabled by the BIOS/GRUB.

And that VGA/VESA emulation doesn't need any BAR or whatever to keep the 
hw running in the state where it was initialized before the kernel 
started. The kernel just grabs the addresses where it needs to write the 
display data and keeps going with that.
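
To illustrate how little that path needs, here is a minimal sketch of
writing a pixel into a linear framebuffer. It is not taken from any kernel
driver; the function names and the 4x2, 32 bpp buffer are made up for the
example, and a bytearray stands in for the memory the firmware left
mapped. All that is required is a base address, a stride and a pixel size.

```python
def pixel_offset(x, y, stride, bytes_per_pixel):
    """Byte offset of pixel (x, y) in a linear framebuffer."""
    return y * stride + x * bytes_per_pixel


def put_pixel(fb, x, y, stride, bytes_per_pixel, color):
    """Store one pixel value (little-endian) at its computed offset."""
    off = pixel_offset(x, y, stride, bytes_per_pixel)
    fb[off:off + bytes_per_pixel] = color.to_bytes(bytes_per_pixel, "little")


# Hypothetical stand-in for the firmware framebuffer: 4x2 pixels, 32 bpp,
# so one scanline (the stride) is 4 * 4 = 16 bytes.
fb = bytearray(4 * 2 * 4)
put_pixel(fb, 1, 1, stride=16, bytes_per_pixel=4, color=0x00FF0000)
```

No register access or firmware upload is involved here, which is why this
keeps working as long as nobody reprograms the display hardware.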

But when a hw-specific driver wants to load, this emulation is the first
thing that gets disabled, because we need to load new firmware. And with
the BARs disabled it can't be re-enabled without rebooting the system.

> My suggestion is that if
> amdgpu fails to talk to the hardware, then let another suitable driver
> do it. I attached a system log taken with "pci=nocrs" and
> "modprobe.blacklist=amdgpu" to show that graphics work correctly in
> this case.
> To do this, does the Linux module loading mechanism need to be refined?

That's actually working as expected. The real problem is that the BIOS 
on that system is so broken that we can't access the hw correctly.

What we could do is check the BARs very early on and refuse to
load when they are disabled. The problem with this approach is that there
are systems where it is normal that the BARs are disabled until the
driver loads and only get enabled during the hardware initialization process.
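
For diagnosing this from userspace, something along the following lines
could be used. This is only a sketch under assumptions, not what amdgpu
does: it parses the contents of the sysfs "config" and "resource" files of
a PCI device (for example under /sys/bus/pci/devices/0000:03:00.0/, where
the address is just a placeholder) and reports whether memory decoding is
enabled and which resource entries have an address range assigned. The
helper names are invented for the example. Given the caveat above, a
negative result before the driver has loaded does not by itself prove that
the system is broken.

```python
import struct

PCI_COMMAND = 0x04         # offset of the command register in config space
PCI_COMMAND_MEMORY = 0x2   # "Memory Space Enable" bit of that register


def memory_decoding_enabled(config: bytes) -> bool:
    """True if the device currently responds to memory (BAR) accesses."""
    (command,) = struct.unpack_from("<H", config, PCI_COMMAND)
    return bool(command & PCI_COMMAND_MEMORY)


def assigned_resources(resource_text: str):
    """Return (index, start, end) for every entry with a non-zero range.

    Each line of the sysfs "resource" file holds three hex numbers:
    start, end and flags. Unassigned entries are all zeros.
    """
    found = []
    lines = [l for l in resource_text.splitlines() if l.strip()]
    for index, line in enumerate(lines):
        start, end, _flags = (int(field, 16) for field in line.split())
        if start or end:
            found.append((index, start, end))
    return found


# Made-up sample data instead of reading a real device:
sample_config = bytearray(64)
struct.pack_into("<H", sample_config, PCI_COMMAND, 0x0006)  # memory + bus master
sample_resource = (
    "0x00000000e0000000 0x00000000efffffff 0x000000000014220c\n"
    "0x0000000000000000 0x0000000000000000 0x0000000000000000\n"
)
```

On a real system the two inputs would come from reading the "config" file
in binary mode and the "resource" file as text.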

What you might want to look into is finding a quirk for the BIOS that
properly enables the NVMe controller.

Regards,
Christian.


