From: Boris Brezillon <boris.brezillon@collabora.com>
To: Steven Price <steven.price@arm.com>
Cc: dri-devel@lists.freedesktop.org, stable@vger.kernel.org,
Rob Herring <robh+dt@kernel.org>, Icecream95 <ixn@keemail.me>,
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>,
Robin Murphy <robin.murphy@arm.com>
Subject: Re: [PATCH 1/2] drm/panfrost: Make sure MMU context lifetime is not bound to panfrost_priv
Date: Wed, 5 Feb 2020 15:01:34 +0100
Message-ID: <20200205150134.340a72c8@collabora.com>
In-Reply-To: <b798bc8f-e8a9-01e9-e234-a8fdef290259@arm.com>
On Wed, 5 Feb 2020 13:39:21 +0000
Steven Price <steven.price@arm.com> wrote:
> On 04/02/2020 14:35, Boris Brezillon wrote:
> > Jobs can be in-flight when the file descriptor is closed (either because
> > the process did not terminate properly, or because it didn't wait for
> > all GPU jobs to be finished), and apparently panfrost_job_close() does
> > not cancel already running jobs. Let's refcount the MMU context object
> > so its lifetime is no longer bound to the FD lifetime and running jobs
> > can finish properly without generating spurious page faults.
>
> Is there any good reason not to just make panfrost_job_close() kill off
> any running jobs?
Nope, I just didn't know how to do that without stopping all other jobs
(should have looked at how mali_kbase is doing that before posting this
patch :)).
> I'm not sure what the benefit is of allowing the jobs
> to still run after the file descriptor has closed.
None that I can think of.
>
> In particular this could cause problems when(/if) Panfrost starts trying
> to deal with "compute" workloads that might have long runtimes. It's
> quite possible to produce a job which never (naturally) exits; currently
> we have a simplistic timeout which kills anything which doesn't complete
> promptly. However there is nothing conceptually wrong with a job which
> takes seconds (or even minutes) to complete.
Absolutely. That was also one of my concerns.
> The hardware has support
> for task switching ('soft stopping') between jobs so this can be done to
> prevent blocking other applications.
Okay. I guess it's implemented in mali_kbase. I'll have a look.
>
> If panfrost_job_close() doesn't kill the jobs then removing the timeouts
> could lead to the situation where there is an 'infinite' job with no
> owner and no way of killing it off. Which doesn't seem like a great
> feature ;)
Didn't know you were planning to remove the timeouts.
>
> Another approach could be simply to silence the page fault output in
> this case - switching the address space to UNMAPPED is actually an
> effective way of killing jobs - at some point I think this was a
> workaround to a hardware bug, but IIRC that was unreleased hardware :)
Okay. I'll check how it's done in mali_kbase.
Thanks for the feedback.
Boris
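
For reference, the refcounting idea from the quoted patch description can be
sketched in plain C as below. This is only an illustration under stated
assumptions, not the code from the patch: struct mmu_ctx, mmu_ctx_create(),
mmu_ctx_get(), mmu_ctx_put() and live_contexts are hypothetical names, and
C11 atomics stand in for whatever refcount primitive the driver would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-file MMU context (not the panfrost struct). */
struct mmu_ctx {
	atomic_int refcount;
	int as;	/* address space slot, -1 while unassigned */
};

/*
 * Counts live contexts so the release point can be observed.
 * Plain int: this demonstration is single-threaded.
 */
static int live_contexts;

static struct mmu_ctx *mmu_ctx_create(void)
{
	struct mmu_ctx *ctx = malloc(sizeof(*ctx));

	if (!ctx)
		return NULL;
	atomic_init(&ctx->refcount, 1);	/* reference owned by the open FD */
	ctx->as = -1;
	live_contexts++;
	return ctx;
}

/* Taken by each job at submission time. */
static struct mmu_ctx *mmu_ctx_get(struct mmu_ctx *ctx)
{
	atomic_fetch_add(&ctx->refcount, 1);
	return ctx;
}

/* Dropped when the FD is closed and when each job completes. */
static void mmu_ctx_put(struct mmu_ctx *ctx)
{
	/* atomic_fetch_sub returns the old value: 1 means this was the last ref. */
	if (atomic_fetch_sub(&ctx->refcount, 1) == 1) {
		live_contexts--;
		free(ctx);
	}
}
```

With this shape, closing the FD only drops the FD's reference; the context is
freed when the last in-flight job drops its own. It does nothing about the
concern raised in the thread that a job may then keep running with no owner.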