dri-devel.lists.freedesktop.org archive mirror
From: Philipp Stanner <phasta@mailbox.org>
To: "Christian König" <christian.koenig@amd.com>,
	phasta@kernel.org, alexdeucher@gmail.com, dakr@kernel.org,
	matthew.brost@intel.com, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: avoid killing parent entity on child SIGKILL v3
Date: Thu, 16 Oct 2025 19:20:00 +0200
Message-ID: <090dcd8ef59f5c4faa0370669bf69eca6a881634.camel@mailbox.org>
In-Reply-To: <08c5d03f-d099-43f9-a26b-d333e394d862@amd.com>

On Thu, 2025-10-16 at 15:11 +0200, Christian König wrote:
> On 16.10.25 14:31, Philipp Stanner wrote:
> > On Wed, 2025-10-15 at 16:01 +0200, Christian König wrote:
> > > From: David Rosca <david.rosca@amd.com>
> > > 
> > > The DRM scheduler tracks who last used an entity and, when that process
> > > is killed, blocks all further submissions to that entity.
> > > 
> > > The problem is that we didn't track who initially created an entity, so
> > > when a process accidentally leaked its file descriptor to a child and
> > > that child got killed, we killed the parent's entities.
> > > 
> > > Avoid that and instead initialize the entity's last user on entity
> > > creation. This also allows dropping the extra NULL check.
> > > 
> > > v2: still use cmpxchg
> > > v3: improve the commit message
> > 
> > For the future, commit messages in the patch's comment body are to be
> > preferred since it's common kernel style. Same applies to the patch
> > version in the title, which should be in [PATCH v3].
> 
> Ah, I just forgot about it!
> 
> > 
> > But that's just a nit. More important:
> > 
> > > 
> > > Signed-off-by: David Rosca <david.rosca@amd.com>
> > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4568
> > 
> > Should this have a Fixes: ?
> 
> No, I've actually removed that because the patch which made it obvious that something is wrong here is correct.
> 
> It's just that this seems to be incorrect ever since we added the code.

Then we should just add the Fixes: tag for the big bang commit,
shouldn't we?

At least maintainer-tools/dim doesn't like the missing tag at all. When
trying to apply this patch it just added the following:

Signed-off-by: David Rosca <david.rosca@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4568
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Fixes: 43bce41cf48e ("drm/scheduler: only kill entity if last user is  killed v2")
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20251015140128.1470-1-christian.koenig@amd.com

and then it complains about the tag it added itself:

Applying: drm/sched: avoid killing parent entity on child SIGKILL v3
[drm-misc-fixes 2a3e82c80bd0] drm/sched: avoid killing parent entity on child SIGKILL v3
 Author: David Rosca <david.rosca@amd.com>
 Date: Wed Oct 15 16:01:28 2025 +0200
 1 file changed, 2 insertions(+), 1 deletion(-)
2a3e82c80bd0 (HEAD -> drm-misc-fixes) drm/sched: avoid killing parent entity on child SIGKILL v3
-:27: WARNING:BAD_FIXES_TAG: Please use correct Fixes: style 'Fixes: <12+ chars of sha1> ("<title line>")' - ie: 'Fixes: 43bce41cf48e ("drm/scheduler: only kill entity if last user is killed v2")'
#27: 
Fixes: 43bce41cf48e ("drm/scheduler: only kill entity if last user is  killed v2")

-:48: CHECK:UNNECESSARY_PARENTHESES: Unnecessary parentheses around 'current->exit_code == SIGKILL'
#48: FILE: drivers/gpu/drm/scheduler/sched_entity.c:306:
+	if (last_user == current->group_leader &&
 	    (current->flags & PF_EXITING) && (current->exit_code == SIGKILL))

total: 0 errors, 1 warnings, 1 checks, 15 lines checked


Which is weird.

In any case, the big bang commit helps stable apply this correctly,
too.

P.

> 
> > 
> > > Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> > > CC: stable@vger.kernel.org
> > 
> > So we want it in drm-misc-fixes, don't we?
> 
> Yes, the patch is based on drm-misc-fixes. I can push it when you give me an rb.
> 
> Alternatively you can push it yourself, whatever you prefer.
> 
> Regards,
> Christian.
> 
> > 
> > 
> > P.
> > 
> > > ---
> > >  drivers/gpu/drm/scheduler/sched_entity.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > > index 5a4697f636f2..3e2f83dc3f24 100644
> > > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > > @@ -70,6 +70,7 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> > >  	entity->guilty = guilty;
> > >  	entity->num_sched_list = num_sched_list;
> > >  	entity->priority = priority;
> > > +	entity->last_user = current->group_leader;
> > >  	/*
> > >  	 * It's perfectly valid to initialize an entity without having a valid
> > >  	 * scheduler attached. It's just not valid to use the scheduler before it
> > > @@ -302,7 +303,7 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout)
> > >  
> > >  	/* For a killed process disallow further enqueueing of jobs. */
> > >  	last_user = cmpxchg(&entity->last_user, current->group_leader, NULL);
> > > -	if ((!last_user || last_user == current->group_leader) &&
> > > +	if (last_user == current->group_leader &&
> > >  	    (current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
> > >  		drm_sched_entity_kill(entity);
> > >  
> > 
> 



Thread overview: 9+ messages
2025-10-15 14:01 [PATCH] drm/sched: avoid killing parent entity on child SIGKILL v3 Christian König
2025-10-16 12:31 ` Philipp Stanner
2025-10-16 13:11   ` Christian König
2025-10-16 17:20     ` Philipp Stanner [this message]
2025-10-16 17:31       ` Christian König
2025-10-17  6:18 ` Philipp Stanner
2025-10-28 13:07   ` Alex Deucher
2025-10-28 13:13     ` Philipp Stanner
2025-10-29 13:57       ` Christian König
