Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type

public inbox for linux-kernel@vger.kernel.org

From: "Michel Dänzer" <michel.daenzer@mailbox.org>
To: "Marek Olšák" <maraeo@gmail.com>,
	"Christian König" <christian.koenig@amd.com>
Cc: "Pierre-Eric Pelloux-Prayer" <pierre-eric.pelloux-prayer@amd.com>,
	"André Almeida" <andrealmeid@igalia.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	"Tuikov, Luben" <Luben.Tuikov@amd.com>,
	"amd-gfx mailing list" <amd-gfx@lists.freedesktop.org>,
	kernel-dev@igalia.com,
	"Deucher, Alexander" <alexander.deucher@amd.com>
Subject: Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type
Date: Wed, 26 Apr 2023 11:51:50 +0200
Message-ID: <9087ef09-e617-dcf3-343e-162f79dc3e51@mailbox.org>
In-Reply-To: <CAAxE2A4capwpc40F49cgZBC9jJisODqNjTe0cM_pS7si5EkW3g@mail.gmail.com>

On 4/25/23 21:11, Marek Olšák wrote:
> The last 3 comments in this thread contain arguments that are false and were specifically pointed out as false 6 comments ago: Soft resets are just as fatal as hard resets. There is nothing better about soft resets. If the VRAM is lost completely, that's a different story, and if the hard reset is 100% unreliable, that's also a different story, but other than those two outliers, there is no difference between the two from the user point view. Both can repeatedly hang if you don't prevent the app that caused the hang from using the GPU even if the app is not robust. The robustness context type doesn't matter here. By definition, no guilty app can continue after a reset, and no innocent apps affected by a reset can continue either because those can now hang too. That's how destructive all resets are. Personal anecdotes that the soft reset is better are just that, anecdotes.

You're trying to frame the situation as black or white, but reality is shades of grey.


There's a similar situation with kernel Oopsen: In principle it's not safe to continue executing the kernel after it hits an Oops, since it might be in an inconsistent state, which could result in any kind of misbehaviour. Still, the default behaviour is to continue executing, and in most cases it turns out fine. Users who cannot accept the residual risk can choose to make the kernel panic when it hits an Oops (either via CONFIG_PANIC_ON_OOPS at build time, or via oops=panic on the kernel command line). A kernel panic means that the machine basically freezes from a user PoV, which would be a worse default for most users (because it would e.g. incur a higher risk of losing filesystem data).
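
As a rough illustration of the two knobs mentioned above, the following Python sketch checks whether a system is set to panic on an Oops. It assumes a Linux system that exposes the kernel.panic_on_oops sysctl at /proc/sys/kernel/panic_on_oops and the boot command line at /proc/cmdline. The function names are hypothetical helpers for this example, not part of any kernel or library API.

```python
from pathlib import Path
from typing import Optional


def cmdline_requests_oops_panic(cmdline: str) -> bool:
    """Return True if the given kernel command line contains oops=panic."""
    return "oops=panic" in cmdline.split()


def panic_on_oops_enabled(
    sysctl_path: str = "/proc/sys/kernel/panic_on_oops",
) -> Optional[bool]:
    """Return the current panic-on-Oops setting, or None if it cannot be read.

    The default path is an assumption about where the sysctl is exposed.
    """
    try:
        value = Path(sysctl_path).read_text().strip()
    except OSError:
        # Not Linux, no procfs, or no permission: the state is unknown.
        return None
    return value != "0"


if __name__ == "__main__":
    state = panic_on_oops_enabled()
    if state is None:
        print("panic_on_oops: unknown (sysctl not readable)")
    else:
        print("panic_on_oops:", "panic" if state else "continue (default)")

    try:
        boot_cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        boot_cmdline = ""
    print("oops=panic on command line:", cmdline_requests_oops_panic(boot_cmdline))
```

The sysctl is read rather than the build configuration because it should reflect the effective runtime setting, whichever of the two mechanisms set it.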


-- 
Earthling Michel Dänzer            |                  https://redhat.com
Libre software enthusiast          |         Mesa and Xwayland developer


