All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jani Nikula <jani.nikula@linux.intel.com>
To: Lyude Paul <lyude@redhat.com>, Jeff Layton <jlayton@kernel.org>,
	"Lin, Wayne" <Wayne.Lin@amd.com>,
	Alex Deucher <alexdeucher@gmail.com>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] drm: make drm_dp_add_payload_part2 gracefully handle NULL state pointer
Date: Mon, 17 Apr 2023 11:44:47 +0300	[thread overview]
Message-ID: <874jpegao0.fsf@intel.com> (raw)
In-Reply-To: <7a1b00f02b125bd65824b18ea09509efe3cf777d.camel@redhat.com>

On Fri, 14 Apr 2023, Lyude Paul <lyude@redhat.com> wrote:
> On Fri, 2023-04-14 at 13:35 +0300, Jani Nikula wrote:
>> On Fri, 14 Apr 2023, Jeff Layton <jlayton@kernel.org> wrote:
>> > On Fri, 2023-04-14 at 04:40 +0000, Lin, Wayne wrote:
>> > > [Public]
>> > > 
>> > > Hi Jeff,
>> > > 
>> > > Thanks. I might need more information to understand why we can't retrieve
>> > > the drm atomic state. Also , "Failed to create MST payload for port" indicates
>> > > error while configuring DPCD payload ID table. Could you help to provide log
>> > > with KMS + ATOMIC + DP debug on please? Thanks in advance!
>> > > 
>> > > Regards,
>> > > Wayne
>> > > 
>> > 
>> > Possibly. I'm not that familiar with display driver debugging. Can you
>> > send me some directions on how to crank up that sort of debug logging?
>> > 
>> > Note that this problem is _very_ intermittent too: I went about 2 weeks
>> > between crashes, and then I got 3 in one day. I'd rather not run with a
>> > lot of debug logging for a long time if that's what this is going to
>> > require, as this is my main workstation.
>> > 
>> > The last time I got this log message, my proposed patch did prevent the
>> > box from oopsing, so I'd really like to see it go in unless it's just
>> > categorically wrong for the caller to pass down a NULL state pointer to
>> > drm_dp_add_payload_part2.
>> 
>> Cc: Lyude.
>> 
>> Looks like the state parameter was added in commit 4d07b0bc4034
>> ("drm/display/dp_mst: Move all payload info into the atomic state") and
>> its only use is to get at state->dev for debug logging.
>> 
>> What's the plan for the parameter? Surely something more than that! :)
>
> I don't think there was any plan for that, or at least I certainly don't even
> remember adding that D:. It must totally have been by mistake and snuck by
> review, if that's the only thing that we're using it for I'd say it's
> definitely fine to just drop it entirely

I guess we could use two patches then, first replace state->dev with
mgr->dev as something that can be backported as needed, and second drop
the state parameter altogether.

Jeff, up for it? At least the first one?


BR,
Jani.


>
>> 
>> Instead of "state ? state->dev : NULL" I guess we could use mgr->dev
>> like the other logging calls do. It's papering over the NULL parameter
>> too, but perhaps in a slightly cleaner way...
>> 
>> 
>> BR,
>> Jani.
>> 
>> 
>> > 
>> > > > -----Original Message-----
>> > > > From: Alex Deucher <alexdeucher@gmail.com>
>> > > > Sent: Thursday, April 13, 2023 8:59 PM
>> > > > To: Jani Nikula <jani.nikula@linux.intel.com>; Lin, Wayne
>> > > > <Wayne.Lin@amd.com>
>> > > > Cc: Jeff Layton <jlayton@kernel.org>; David Airlie <airlied@gmail.com>;
>> > > > Daniel Vetter <daniel@ffwll.ch>; Deucher, Alexander
>> > > > <Alexander.Deucher@amd.com>; linux-kernel@vger.kernel.org; dri-
>> > > > devel@lists.freedesktop.org
>> > > > Subject: Re: [PATCH] drm: make drm_dp_add_payload_part2 gracefully
>> > > > handle NULL state pointer
>> > > > 
>> > > > + Wayne
>> > > > 
>> > > > On Thu, Apr 13, 2023 at 8:31 AM Jani Nikula <jani.nikula@linux.intel.com>
>> > > > wrote:
>> > > > > 
>> > > > > On Thu, 13 Apr 2023, Jeff Layton <jlayton@kernel.org> wrote:
>> > > > > > I've been experiencing some intermittent crashes down in the display
>> > > > > > driver code. The symptoms are ususally a line like this in dmesg:
>> > > > > > 
>> > > > > >     amdgpu 0000:30:00.0: [drm] Failed to create MST payload for port
>> > > > > > 000000006d3a3885: -5
>> > > > > > 
>> > > > > > ...followed by an Oops due to a NULL pointer dereference.
>> > > > > > 
>> > > > > > The real bug is probably in the caller of this function, which is
>> > > > > > passing it a NULL state pointer, but this patch at least keeps my
>> > > > > > machine from oopsing when this occurs.
>> > > > > 
>> > > > > My fear is that papering over this makes the root cause harder to find.
>> > > > > 
>> > > > > Cc: Harry, Alex
>> > > > > 
>> > > > > 
>> > > > > BR,
>> > > > > Jani.
>> > > > > 
>> > > > > 
>> > > > > > 
>> > > > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2184855
>> > > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
>> > > > > > ---
>> > > > > >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 3 ++-
>> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > > > > > 
>> > > > > > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > > > > > b/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > > > > > index 38dab76ae69e..87ad406c50f9 100644
>> > > > > > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > > > > > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > > > > > @@ -3404,7 +3404,8 @@ int drm_dp_add_payload_part2(struct
>> > > > > > drm_dp_mst_topology_mgr *mgr,
>> > > > > > 
>> > > > > >       /* Skip failed payloads */
>> > > > > >       if (payload->vc_start_slot == -1) {
>> > > > > > -             drm_dbg_kms(state->dev, "Part 1 of payload creation for %s
>> > > > failed, skipping part 2\n",
>> > > > > > +             drm_dbg_kms(state ? state->dev : NULL,
>> > > > > > +                         "Part 1 of payload creation for %s failed,
>> > > > > > + skipping part 2\n",
>> > > > > >                           payload->port->connector->name);
>> > > > > >               return -EIO;
>> > > > > >       }
>> > > > > 
>> > > > > --
>> > > > > Jani Nikula, Intel Open Source Graphics Center
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center

WARNING: multiple messages have this Message-ID (diff)
From: Jani Nikula <jani.nikula@linux.intel.com>
To: Lyude Paul <lyude@redhat.com>, Jeff Layton <jlayton@kernel.org>,
	"Lin, Wayne" <Wayne.Lin@amd.com>,
	Alex Deucher <alexdeucher@gmail.com>
Cc: David Airlie <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH] drm: make drm_dp_add_payload_part2 gracefully handle NULL state pointer
Date: Mon, 17 Apr 2023 11:44:47 +0300	[thread overview]
Message-ID: <874jpegao0.fsf@intel.com> (raw)
In-Reply-To: <7a1b00f02b125bd65824b18ea09509efe3cf777d.camel@redhat.com>

On Fri, 14 Apr 2023, Lyude Paul <lyude@redhat.com> wrote:
> On Fri, 2023-04-14 at 13:35 +0300, Jani Nikula wrote:
>> On Fri, 14 Apr 2023, Jeff Layton <jlayton@kernel.org> wrote:
>> > On Fri, 2023-04-14 at 04:40 +0000, Lin, Wayne wrote:
>> > > [Public]
>> > > 
>> > > Hi Jeff,
>> > > 
>> > > Thanks. I might need more information to understand why we can't retrieve
>> > > the drm atomic state. Also , "Failed to create MST payload for port" indicates
>> > > error while configuring DPCD payload ID table. Could you help to provide log
>> > > with KMS + ATOMIC + DP debug on please? Thanks in advance!
>> > > 
>> > > Regards,
>> > > Wayne
>> > > 
>> > 
>> > Possibly. I'm not that familiar with display driver debugging. Can you
>> > send me some directions on how to crank up that sort of debug logging?
>> > 
>> > Note that this problem is _very_ intermittent too: I went about 2 weeks
>> > between crashes, and then I got 3 in one day. I'd rather not run with a
>> > lot of debug logging for a long time if that's what this is going to
>> > require, as this is my main workstation.
>> > 
>> > The last time I got this log message, my proposed patch did prevent the
>> > box from oopsing, so I'd really like to see it go in unless it's just
>> > categorically wrong for the caller to pass down a NULL state pointer to
>> > drm_dp_add_payload_part2.
>> 
>> Cc: Lyude.
>> 
>> Looks like the state parameter was added in commit 4d07b0bc4034
>> ("drm/display/dp_mst: Move all payload info into the atomic state") and
>> its only use is to get at state->dev for debug logging.
>> 
>> What's the plan for the parameter? Surely something more than that! :)
>
> I don't think there was any plan for that, or at least I certainly don't even
> remember adding that D:. It must totally have been by mistake and snuck by
> review, if that's the only thing that we're using it for I'd say it's
> definitely fine to just drop it entirely

I guess we could use two patches then, first replace state->dev with
mgr->dev as something that can be backported as needed, and second drop
the state parameter altogether.

Jeff, up for it? At least the first one?


BR,
Jani.


>
>> 
>> Instead of "state ? state->dev : NULL" I guess we could use mgr->dev
>> like the other logging calls do. It's papering over the NULL parameter
>> too, but perhaps in a slightly cleaner way...
>> 
>> 
>> BR,
>> Jani.
>> 
>> 
>> > 
>> > > > -----Original Message-----
>> > > > From: Alex Deucher <alexdeucher@gmail.com>
>> > > > Sent: Thursday, April 13, 2023 8:59 PM
>> > > > To: Jani Nikula <jani.nikula@linux.intel.com>; Lin, Wayne
>> > > > <Wayne.Lin@amd.com>
>> > > > Cc: Jeff Layton <jlayton@kernel.org>; David Airlie <airlied@gmail.com>;
>> > > > Daniel Vetter <daniel@ffwll.ch>; Deucher, Alexander
>> > > > <Alexander.Deucher@amd.com>; linux-kernel@vger.kernel.org; dri-
>> > > > devel@lists.freedesktop.org
>> > > > Subject: Re: [PATCH] drm: make drm_dp_add_payload_part2 gracefully
>> > > > handle NULL state pointer
>> > > > 
>> > > > + Wayne
>> > > > 
>> > > > On Thu, Apr 13, 2023 at 8:31 AM Jani Nikula <jani.nikula@linux.intel.com>
>> > > > wrote:
>> > > > > 
>> > > > > On Thu, 13 Apr 2023, Jeff Layton <jlayton@kernel.org> wrote:
>> > > > > > I've been experiencing some intermittent crashes down in the display
>> > > > > > driver code. The symptoms are ususally a line like this in dmesg:
>> > > > > > 
>> > > > > >     amdgpu 0000:30:00.0: [drm] Failed to create MST payload for port
>> > > > > > 000000006d3a3885: -5
>> > > > > > 
>> > > > > > ...followed by an Oops due to a NULL pointer dereference.
>> > > > > > 
>> > > > > > The real bug is probably in the caller of this function, which is
>> > > > > > passing it a NULL state pointer, but this patch at least keeps my
>> > > > > > machine from oopsing when this occurs.
>> > > > > 
>> > > > > My fear is that papering over this makes the root cause harder to find.
>> > > > > 
>> > > > > Cc: Harry, Alex
>> > > > > 
>> > > > > 
>> > > > > BR,
>> > > > > Jani.
>> > > > > 
>> > > > > 
>> > > > > > 
>> > > > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2184855
>> > > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
>> > > > > > ---
>> > > > > >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 3 ++-
>> > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > > > > > 
>> > > > > > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > > > > > b/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > > > > > index 38dab76ae69e..87ad406c50f9 100644
>> > > > > > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > > > > > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
>> > > > > > @@ -3404,7 +3404,8 @@ int drm_dp_add_payload_part2(struct
>> > > > > > drm_dp_mst_topology_mgr *mgr,
>> > > > > > 
>> > > > > >       /* Skip failed payloads */
>> > > > > >       if (payload->vc_start_slot == -1) {
>> > > > > > -             drm_dbg_kms(state->dev, "Part 1 of payload creation for %s
>> > > > failed, skipping part 2\n",
>> > > > > > +             drm_dbg_kms(state ? state->dev : NULL,
>> > > > > > +                         "Part 1 of payload creation for %s failed,
>> > > > > > + skipping part 2\n",
>> > > > > >                           payload->port->connector->name);
>> > > > > >               return -EIO;
>> > > > > >       }
>> > > > > 
>> > > > > --
>> > > > > Jani Nikula, Intel Open Source Graphics Center
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center

  reply	other threads:[~2023-04-17  8:44 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-13 11:12 [PATCH] drm: make drm_dp_add_payload_part2 gracefully handle NULL state pointer Jeff Layton
2023-04-13 11:12 ` Jeff Layton
2023-04-13 12:31 ` Jani Nikula
2023-04-13 12:31   ` Jani Nikula
2023-04-13 12:43   ` Jeff Layton
2023-04-13 12:43     ` Jeff Layton
2023-04-13 12:58   ` Alex Deucher
2023-04-13 12:58     ` Alex Deucher
2023-04-14  4:40     ` Lin, Wayne
2023-04-14  4:40       ` Lin, Wayne
2023-04-14 10:15       ` Jeff Layton
2023-04-14 10:15         ` Jeff Layton
2023-04-14 10:35         ` Jani Nikula
2023-04-14 10:35           ` Jani Nikula
2023-04-14 22:51           ` Lyude Paul
2023-04-14 22:51             ` Lyude Paul
2023-04-17  8:44             ` Jani Nikula [this message]
2023-04-17  8:44               ` Jani Nikula
2023-04-17 10:06               ` Jeff Layton
2023-04-17 10:06                 ` Jeff Layton
2023-04-17 10:29                 ` Jani Nikula
2023-04-17 10:29                   ` Jani Nikula
2023-04-17 10:58                   ` Lin, Wayne
2023-04-17 10:58                     ` Lin, Wayne
2023-04-17 11:13                     ` Jeff Layton
2023-04-17 11:13                       ` Jeff Layton
2023-04-17 11:02                   ` Jeff Layton
2023-04-17 11:02                     ` Jeff Layton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=874jpegao0.fsf@intel.com \
    --to=jani.nikula@linux.intel.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Wayne.Lin@amd.com \
    --cc=alexdeucher@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jlayton@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lyude@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.