public inbox for linux-kernel@vger.kernel.org
From: Pekka Paalanen <ppaalanen@gmail.com>
To: "Michel Dänzer" <michel.daenzer@mailbox.org>
Cc: "Marek Olšák" <maraeo@gmail.com>,
	pierre-eric.pelloux-prayer@amd.com,
	"Sebastian Wick" <sebastian.wick@redhat.com>,
	amd-gfx@lists.freedesktop.org,
	"André Almeida" <andrealmeid@igalia.com>,
	"Timur Kristóf" <timur.kristof@gmail.com>,
	"Randy Dunlap" <rdunlap@infradead.org>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	alexander.deucher@amd.com,
	"Samuel Pitoiset" <samuel.pitoiset@gmail.com>,
	kernel-dev@igalia.com, christian.koenig@amd.com
Subject: Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations
Date: Mon, 3 Jul 2023 11:49:49 +0300
Message-ID: <20230703114949.796c7498@eldfell>
In-Reply-To: <7c1e6df5-1ad4-be3c-b95d-92dc62a8c537@mailbox.org>

On Mon, 3 Jul 2023 09:12:29 +0200
Michel Dänzer <michel.daenzer@mailbox.org> wrote:

> On 6/30/23 22:32, Marek Olšák wrote:
> > On Fri, Jun 30, 2023 at 11:11 AM Michel Dänzer <michel.daenzer@mailbox.org> wrote:
> >> On 6/30/23 16:59, Alex Deucher wrote:  
> >>> On Fri, Jun 30, 2023 at 10:49 AM Sebastian Wick
> >>> <sebastian.wick@redhat.com> wrote:
> >>>> On Tue, Jun 27, 2023 at 3:23 PM André Almeida <andrealmeid@igalia.com> wrote:
> >>>>>
> >>>>> +Robustness
> >>>>> +----------
> >>>>> +
> >>>>> +The only way to try to keep an application working after a reset is if it
> >>>>> +complies with the robustness aspects of the graphical API that it is using.
> >>>>> +
> >>>>> +Graphical APIs provide ways to applications to deal with device resets. However,
> >>>>> +there is no guarantee that the app will use such features correctly, and the
> >>>>> +UMD can implement policies to close the app if it is a repeating offender,
> >>>>> +likely in a broken loop. This is done to ensure that it does not keep blocking
> >>>>> +the user interface from being correctly displayed. This should be done even if
> >>>>> +the app is correct but happens to trigger some bug in the hardware/driver.  
> >>>>
> >>>> I still don't think it's good to let the kernel arbitrarily kill
> >>>> processes that it thinks are not well-behaved based on some heuristics
> >>>> and policy.
> >>>>
> >>>> Can't this be outsourced to user space? Expose the information about
> >>>> processes causing a device reset and let e.g. systemd deal with coming up
> >>>> with a policy and with killing stuff.
> >>>
> >>> I don't think it's the kernel doing the killing, it would be the UMD.
> >>> E.g., if the app is guilty and doesn't support robustness the UMD can
> >>> just call exit().  
> >>
> >> It would be safer to just ignore API calls[0], similarly to what
> >> is done until the application destroys the context with
> >> robustness. Calling exit() likely results in losing any unsaved
> >> work, whereas at least some applications might otherwise allow
> >> saving the work by other means.  
> > 
> > That's a terrible idea. Ignoring API calls would be identical to a
> > freeze. You might as well disable GPU recovery because the result
> > would be the same.  
> 
> No GPU recovery would affect everything using the GPU, whereas this
> affects only non-robust applications.
> 
> 
> > - non-robust contexts: call exit(1) immediately, which is the best
> > way to recover  
> 
> That's not the UMD's call to make.
> 
> 
> >>     [0] Possibly accompanied by a one-time message to stderr along
> >> the lines of "GPU reset detected but robustness not enabled in
> >> context, ignoring OpenGL API calls".  
> 

Hi,

Michel does have a point. It's not just games and display servers that
use the GPU, but productivity tools as well. They may autosave
periodically in anticipation of crashes, but being able to do a final
save before quitting would be nice. The UMD killing the process would be
new behaviour, right? Previously, either the application's GPU thread
hung or various API calls returned errors, but the process was not
killed, was it?

If an application freezes, that's "no problem"; the end user can just
continue using everything else, alt-tabbing away if the app was
fullscreen. I already do that with games, even on Xorg.

If a display server freezes, that's a desktop-wide problem, but so is
killing it.

OTOH, if the UMD really does need to terminate the process, then please do
it in a way that causes a crash report to be recorded. _exit() with an
error code is not it.
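
To illustrate the difference, here is a minimal sketch in Python, assuming
a POSIX system. The helper names run_child and classify are made up for
this illustration, and a UMD would of course do the equivalent in C. A
child that calls os._exit(1) looks like an ordinary exit with an error
code, while a child that calls os.abort() is killed by SIGABRT, which is
the kind of termination that core dump handlers and crash reporters
typically act on. Whether a report is actually recorded still depends on
system configuration.

```python
import signal
import subprocess
import sys


def run_child(snippet: str) -> int:
    """Run a Python snippet in a child process and return its return code.

    On POSIX, subprocess reports a negative return code -N when the
    child was terminated by signal N.
    """
    return subprocess.run(
        [sys.executable, "-c", snippet],
        stderr=subprocess.DEVNULL,
    ).returncode


def classify(returncode: int) -> str:
    """Describe how the child ended, based on its return code."""
    if returncode < 0:
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status " + str(returncode)


# Child 1: immediate exit with an error code, as with _exit(1) in C.
exit_status = run_child("import os; os._exit(1)")

# Child 2: abort(), which raises SIGABRT in the calling process.
abort_status = run_child("import os; os.abort()")

print("_exit(1):", classify(exit_status))
print("abort(): ", classify(abort_status))
```

From the parent's point of view, only the second child died abnormally;
the first one is indistinguishable from an application that chose to quit
with a failure status.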


Thanks,
pq
