From: khsieh@codeaurora.org
To: Douglas Anderson <dianders@chromium.org>
Cc: Rob Clark <robdclark@gmail.com>, Daniel Vetter <daniel@ffwll.ch>,
David Airlie <airlied@linux.ie>, Sean Paul <sean@poorly.run>,
Stephen Boyd <swboyd@chromium.org>,
dri-devel@lists.freedesktop.org, freedreno@lists.freedesktop.org,
linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes
Date: Tue, 09 Nov 2021 14:46:01 -0800 [thread overview]
Message-ID: <bc63c12983fd05bfb97e9a2ce00b0bd6@codeaurora.org> (raw)
In-Reply-To: <20211109100403.1.I4e23470d681f7efe37e2e7f1a6466e15e9bb1d72@changeid>
On 2021-11-09 10:04, Douglas Anderson wrote:
> If you happened to try to access `/dev/drm_dp_aux` devices provided by
> the MSM DP AUX driver too early at bootup you could go boom. Let's
> avoid that by only allowing AUX transfers when the controller is
> powered up.
>
> Specifically the crash that was seen (on Chrome OS 5.4 tree with
> relevant backports):
> Kernel panic - not syncing: Asynchronous SError Interrupt
> CPU: 0 PID: 3131 Comm: fwupd Not tainted 5.4.144-16620-g28af11b73efb
> #1
> Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
> Call trace:
> dump_backtrace+0x0/0x14c
> show_stack+0x20/0x2c
> dump_stack+0xac/0x124
> panic+0x150/0x390
> nmi_panic+0x80/0x94
> arm64_serror_panic+0x78/0x84
> do_serror+0x0/0x118
> do_serror+0xa4/0x118
> el1_error+0xbc/0x160
> dp_catalog_aux_write_data+0x1c/0x3c
> dp_aux_cmd_fifo_tx+0xf0/0x1b0
> dp_aux_transfer+0x1b0/0x2bc
> drm_dp_dpcd_access+0x8c/0x11c
> drm_dp_dpcd_read+0x64/0x10c
> auxdev_read_iter+0xd4/0x1c4
>
> I did a little bit of tracing and found that:
> * We register the AUX device very early at bootup.
> * Power isn't actually turned on for my system until
> hpd_event_thread() -> dp_display_host_init() -> dp_power_init()
> * You can see that dp_power_init() calls dp_aux_init() which is where
> we start allowing AUX channel requests to go through.
>
> In general this patch is a bit of a bandaid but at least it gets us
> out of the current state where userspace acting at the wrong time can
> fully crash the system.
> * I think the more proper fix (which requires quite a bit more
> changes) is to power stuff on while an AUX transfer is
> happening. This is like the solution we did for ti-sn65dsi86. This
> might be required for us to move to populating the panel via the
> DP-AUX bus.
> * Another fix considered was to dynamically register / unregister. I
> tried that at <https://crrev.com/c/3169431/3> but it got
> ugly. Currently there's a bug where the pm_runtime() state isn't
> tracked properly and that causes us to just keep registering more
> and more.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>
> drivers/gpu/drm/msm/dp/dp_aux.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c
> b/drivers/gpu/drm/msm/dp/dp_aux.c
> index eb40d8413bca..6d36f63c3338 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -33,6 +33,7 @@ struct dp_aux_private {
> bool read;
> bool no_send_addr;
> bool no_send_stop;
> + bool initted;
> u32 offset;
> u32 segment;
>
> @@ -331,6 +332,10 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux
> *dp_aux,
> }
>
> mutex_lock(&aux->mutex);
> + if (!aux->initted) {
> + ret = -EIO;
> + goto exit;
> + }
>
> dp_aux_update_offset_and_segment(aux, msg);
> dp_aux_transfer_helper(aux, msg, true);
> @@ -380,6 +385,8 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux
> *dp_aux,
> }
>
> aux->cmd_busy = false;
> +
> +exit:
> mutex_unlock(&aux->mutex);
>
> return ret;
> @@ -431,8 +438,13 @@ void dp_aux_init(struct drm_dp_aux *dp_aux)
>
> aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
>
> + mutex_lock(&aux->mutex);
> +
> dp_catalog_aux_enable(aux->catalog, true);
> aux->retry_cnt = 0;
> + aux->initted = true;
> +
> + mutex_unlock(&aux->mutex);
> }
>
> void dp_aux_deinit(struct drm_dp_aux *dp_aux)
> @@ -441,7 +453,12 @@ void dp_aux_deinit(struct drm_dp_aux *dp_aux)
>
> aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
>
> + mutex_lock(&aux->mutex);
> +
> + aux->initted = false;
> dp_catalog_aux_enable(aux->catalog, false);
> +
> + mutex_unlock(&aux->mutex);
> }
>
> int dp_aux_register(struct drm_dp_aux *dp_aux)
WARNING: multiple messages have this Message-ID (diff)
From: khsieh@codeaurora.org
To: Douglas Anderson <dianders@chromium.org>
Cc: David Airlie <airlied@linux.ie>,
freedreno@lists.freedesktop.org, linux-kernel@vger.kernel.org,
dri-devel@lists.freedesktop.org,
Stephen Boyd <swboyd@chromium.org>,
linux-arm-msm@vger.kernel.org, Sean Paul <sean@poorly.run>
Subject: Re: [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes
Date: Tue, 09 Nov 2021 14:46:01 -0800 [thread overview]
Message-ID: <bc63c12983fd05bfb97e9a2ce00b0bd6@codeaurora.org> (raw)
In-Reply-To: <20211109100403.1.I4e23470d681f7efe37e2e7f1a6466e15e9bb1d72@changeid>
On 2021-11-09 10:04, Douglas Anderson wrote:
> If you happened to try to access `/dev/drm_dp_aux` devices provided by
> the MSM DP AUX driver too early at bootup you could go boom. Let's
> avoid that by only allowing AUX transfers when the controller is
> powered up.
>
> Specifically the crash that was seen (on Chrome OS 5.4 tree with
> relevant backports):
> Kernel panic - not syncing: Asynchronous SError Interrupt
> CPU: 0 PID: 3131 Comm: fwupd Not tainted 5.4.144-16620-g28af11b73efb
> #1
> Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
> Call trace:
> dump_backtrace+0x0/0x14c
> show_stack+0x20/0x2c
> dump_stack+0xac/0x124
> panic+0x150/0x390
> nmi_panic+0x80/0x94
> arm64_serror_panic+0x78/0x84
> do_serror+0x0/0x118
> do_serror+0xa4/0x118
> el1_error+0xbc/0x160
> dp_catalog_aux_write_data+0x1c/0x3c
> dp_aux_cmd_fifo_tx+0xf0/0x1b0
> dp_aux_transfer+0x1b0/0x2bc
> drm_dp_dpcd_access+0x8c/0x11c
> drm_dp_dpcd_read+0x64/0x10c
> auxdev_read_iter+0xd4/0x1c4
>
> I did a little bit of tracing and found that:
> * We register the AUX device very early at bootup.
> * Power isn't actually turned on for my system until
> hpd_event_thread() -> dp_display_host_init() -> dp_power_init()
> * You can see that dp_power_init() calls dp_aux_init() which is where
> we start allowing AUX channel requests to go through.
>
> In general this patch is a bit of a bandaid but at least it gets us
> out of the current state where userspace acting at the wrong time can
> fully crash the system.
> * I think the more proper fix (which requires quite a bit more
> changes) is to power stuff on while an AUX transfer is
> happening. This is like the solution we did for ti-sn65dsi86. This
> might be required for us to move to populating the panel via the
> DP-AUX bus.
> * Another fix considered was to dynamically register / unregister. I
> tried that at <https://crrev.com/c/3169431/3> but it got
> ugly. Currently there's a bug where the pm_runtime() state isn't
> tracked properly and that causes us to just keep registering more
> and more.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
>
> drivers/gpu/drm/msm/dp/dp_aux.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c
> b/drivers/gpu/drm/msm/dp/dp_aux.c
> index eb40d8413bca..6d36f63c3338 100644
> --- a/drivers/gpu/drm/msm/dp/dp_aux.c
> +++ b/drivers/gpu/drm/msm/dp/dp_aux.c
> @@ -33,6 +33,7 @@ struct dp_aux_private {
> bool read;
> bool no_send_addr;
> bool no_send_stop;
> + bool initted;
> u32 offset;
> u32 segment;
>
> @@ -331,6 +332,10 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux
> *dp_aux,
> }
>
> mutex_lock(&aux->mutex);
> + if (!aux->initted) {
> + ret = -EIO;
> + goto exit;
> + }
>
> dp_aux_update_offset_and_segment(aux, msg);
> dp_aux_transfer_helper(aux, msg, true);
> @@ -380,6 +385,8 @@ static ssize_t dp_aux_transfer(struct drm_dp_aux
> *dp_aux,
> }
>
> aux->cmd_busy = false;
> +
> +exit:
> mutex_unlock(&aux->mutex);
>
> return ret;
> @@ -431,8 +438,13 @@ void dp_aux_init(struct drm_dp_aux *dp_aux)
>
> aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
>
> + mutex_lock(&aux->mutex);
> +
> dp_catalog_aux_enable(aux->catalog, true);
> aux->retry_cnt = 0;
> + aux->initted = true;
> +
> + mutex_unlock(&aux->mutex);
> }
>
> void dp_aux_deinit(struct drm_dp_aux *dp_aux)
> @@ -441,7 +453,12 @@ void dp_aux_deinit(struct drm_dp_aux *dp_aux)
>
> aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
>
> + mutex_lock(&aux->mutex);
> +
> + aux->initted = false;
> dp_catalog_aux_enable(aux->catalog, false);
> +
> + mutex_unlock(&aux->mutex);
> }
>
> int dp_aux_register(struct drm_dp_aux *dp_aux)
next prev parent reply other threads:[~2021-11-09 22:47 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-09 18:04 [PATCH] drm/msm/dp: Avoid unpowered AUX xfers that caused crashes Douglas Anderson
2021-11-09 18:04 ` Douglas Anderson
2021-11-09 22:46 ` khsieh [this message]
2021-11-09 22:46 ` khsieh
2021-11-10 1:43 ` [Freedreno] " Abhinav Kumar
2021-11-10 1:43 ` Abhinav Kumar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bc63c12983fd05bfb97e9a2ce00b0bd6@codeaurora.org \
--to=khsieh@codeaurora.org \
--cc=airlied@linux.ie \
--cc=daniel@ffwll.ch \
--cc=dianders@chromium.org \
--cc=dri-devel@lists.freedesktop.org \
--cc=freedreno@lists.freedesktop.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=robdclark@gmail.com \
--cc=sean@poorly.run \
--cc=swboyd@chromium.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.