dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Lyude Paul <lyude@redhat.com>
To: "Lin, Wayne" <Wayne.Lin@amd.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"nouveau@lists.freedesktop.org" <nouveau@lists.freedesktop.org>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>,
	Jani Nikula <jani.nikula@intel.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	open list <linux-kernel@vger.kernel.org>,
	"Lakha,  Bhawanpreet" <Bhawanpreet.Lakha@amd.com>,
	David Airlie <airlied@linux.ie>, "Zuo, Jerry" <Jerry.Zuo@amd.com>,
	Sean Paul <sean@poorly.run>
Subject: Re: [RESEND RFC 15/18] drm/display/dp_mst: Skip releasing payloads if last connected port isn't connected
Date: Tue, 02 Aug 2022 18:12:28 -0400	[thread overview]
Message-ID: <31d47373883e9aabe5bfa7b172e21b84cc6a164d.camel@redhat.com> (raw)
In-Reply-To: <CO6PR12MB54890BFD954BBF578E2ADA67FC819@CO6PR12MB5489.namprd12.prod.outlook.com>

On Tue, 2022-07-05 at 08:45 +0000, Lin, Wayne wrote:
> [Public]
> 
> 
> 
> > -----Original Message-----
> > From: Lyude Paul <lyude@redhat.com>
> > Sent: Wednesday, June 8, 2022 3:30 AM
> > To: dri-devel@lists.freedesktop.org; nouveau@lists.freedesktop.org; amd-
> > gfx@lists.freedesktop.org
> > Cc: Lin, Wayne <Wayne.Lin@amd.com>; Ville Syrjälä
> > <ville.syrjala@linux.intel.com>; Zuo, Jerry <Jerry.Zuo@amd.com>; Jani
> > Nikula
> > <jani.nikula@intel.com>; Imre Deak <imre.deak@intel.com>; Daniel Vetter
> > <daniel.vetter@ffwll.ch>; Sean Paul <sean@poorly.run>; David Airlie
> > <airlied@linux.ie>; Daniel Vetter <daniel@ffwll.ch>; Thomas Zimmermann
> > <tzimmermann@suse.de>; Lakha, Bhawanpreet
> > <Bhawanpreet.Lakha@amd.com>; open list <linux-kernel@vger.kernel.org>
> > Subject: [RESEND RFC 15/18] drm/display/dp_mst: Skip releasing payloads if
> > last connected port isn't connected
> > 
> > In the past, we've ran into strange issues regarding errors in response to
> > trying to destroy payloads after a port has been unplugged. We fixed this
> > back in:
> > 
> > This is intended to replace the workaround that was added here:
> > 
> > commit 3769e4c0af5b ("drm/dp_mst: Avoid to mess up payload table by
> > ports in stale topology")
> > 
> > which was intended fix to some of the payload leaks that were observed
> > before, where we would attempt to determine if the port was still
> > connected to the topology before updating payloads using
> > drm_dp_mst_port_downstream_of_branch. This wasn't a particularly good
> > solution, since one of the points of still having port and mstb validation
> > is to
> > avoid sending messages to newly disconnected branches wherever possible
> > - thus the required use of drm_dp_mst_port_downstream_of_branch
> > would indicate something may be wrong with said validation.
> > 
> > It seems like it may have just been races and luck that made
> > drm_dp_mst_port_downstream_of_branch work however, as while I was
> > trying to figure out the true cause of this issue when removing the legacy
> > MST code - I discovered an important excerpt in section 2.14.2.3.3.6 of
> > the DP
> > 2.0
> > specs:
> > 
> > "BAD_PARAM - This reply is transmitted when a Message Transaction
> > parameter is in error; for example, the next port number is invalid or /no
> > device is connected/ to the port associated with the port number."
> > 
> > Sure enough - removing the calls to
> > drm_dp_mst_port_downstream_of_branch()
> > and instead checking the ->ddps field of the parent port to see whether we
> > should release a given payload or not seems to totally fix the issue. This
> > does
> > actually make sense to me, as it seems the implication is that given a
> > topology where an MSTB is removed, the payload for the MST parent's port
> > will be released automatically if that port is also marked as
> > disconnected.
> > However, if there's another parent in the chain after that which is
> > connected
> > - payloads must be released there with an ALLOCATE_PAYLOAD message.
> > 
> > So, let's do that!
> > 
> > Signed-off-by: Lyude Paul <lyude@redhat.com>
> > Cc: Wayne Lin <Wayne.Lin@amd.com>
> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Cc: Fangzhi Zuo <Jerry.Zuo@amd.com>
> > Cc: Jani Nikula <jani.nikula@intel.com>
> > Cc: Imre Deak <imre.deak@intel.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Sean Paul <sean@poorly.run>
> > ---
> >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 51 +++++++------------
> >  1 file changed, 17 insertions(+), 34 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > index dd314586bac3..70adb8db4335 100644
> > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > @@ -3137,7 +3137,7 @@ static struct drm_dp_mst_port
> > *drm_dp_get_last_connected_port_to_mstb(struct drm  static struct
> > drm_dp_mst_branch *  drm_dp_get_last_connected_port_and_mstb(struct
> > drm_dp_mst_topology_mgr *mgr,
> >                                         struct drm_dp_mst_branch *mstb,
> > -                                       int *port_num)
> > +                                       struct drm_dp_mst_port
> > **last_port)
> >  {
> >         struct drm_dp_mst_branch *rmstb = NULL;
> >         struct drm_dp_mst_port *found_port;
> > @@ -3153,7 +3153,8 @@
> > drm_dp_get_last_connected_port_and_mstb(struct
> > drm_dp_mst_topology_mgr *mgr,
> > 
> >                 if (drm_dp_mst_topology_try_get_mstb(found_port-
> > > parent)) {
> >                         rmstb = found_port->parent;
> > -                       *port_num = found_port->port_num;
> > +                       *last_port = found_port;
> > +                       drm_dp_mst_get_port_malloc(found_port);
> >                 } else {
> >                         /* Search again, starting from this parent */
> >                         mstb = found_port->parent;
> > @@ -3170,7 +3171,7 @@ static int drm_dp_payload_send_msg(struct
> > drm_dp_mst_topology_mgr *mgr,
> >                                    int pbn)
> >  {
> >         struct drm_dp_sideband_msg_tx *txmsg;
> > -       struct drm_dp_mst_branch *mstb;
> > +       struct drm_dp_mst_branch *mstb = NULL;
> >         int ret, port_num;
> >         u8 sinks[DRM_DP_MAX_SDP_STREAMS];
> >         int i;
> > @@ -3178,12 +3179,22 @@ static int drm_dp_payload_send_msg(struct
> > drm_dp_mst_topology_mgr *mgr,
> >         port_num = port->port_num;
> >         mstb = drm_dp_mst_topology_get_mstb_validated(mgr, port-
> > > parent);
> >         if (!mstb) {
> > -               mstb = drm_dp_get_last_connected_port_and_mstb(mgr,
> > -                                                              port-
> > >parent,
> > -                                                              &port_num);
> > +               struct drm_dp_mst_port *rport = NULL;
> > +               bool ddps;
> > 
> > +               mstb = drm_dp_get_last_connected_port_and_mstb(mgr,
> > port->parent,
> > +&rport);
> >                 if (!mstb)
> >                         return -EINVAL;
> > +
> > +               ddps = rport->ddps;
> > +               port_num = rport->port_num;
> > +               drm_dp_mst_put_port_malloc(rport);
> > +
> > +               /* If the port is currently marked as disconnected, don't
> > send
> > a payload message */
> > +               if (!ddps) {
> Hi Lyude,
> 
> Thanks for driving this!
> Shouldn't we still send ALLOCATE_PAYLOAD with PBN 0 to the last connected 
> Port even its peer device is disconnected? We rely on this "path msg" to
> update
> all payload ID tables along the virtual payload channel.
> 

Do you know if there's any devices that break with this change, btw? Would be
super useful to know imho, and if so I might be alright with dropping it
depending on what the answer to the next paragraph is.

> commit 3769e4c0af5b ("drm/dp_mst: Avoid to mess up payload table by
> ports in stale topology") was trying to skip updating payload for a target
> which is
> no longer existing in the current topology rooted at mgr->mst_primary. I
> passed
> "mgr->mst_primary" to drm_dp_mst_port_downstream_of_branch() previously.
> Sorry, I might not fully understand the issue you've seen. Could you
> elaborate on
> this more please?
> 
> Thanks!

I will have to double check this since it's been a month, but basically - the
idea of having the topology references in the first place was to be the one
check for figuring out whether something's in a topology or not. I've been
thinking of maybe trying to replace it at some point, but I think we'd want to
do it all over the helpers instead of just in certain spots.

The other thing I noticed was that when I was rewriting this code, I noticed
it seemed a lot like we had misunderstood the issue that was causing leaks in
the first place. The BAD_PARAM we noticed indicates the payload we're trying
to remove on the other end doesn't exist anymore, meaning the branch device in
question got rid of any payloads it had active in response to the CSN. In
testing though I found that payloads would be automatically released in
situations where the last reachable port was marked as disconnected via a
previous CSN, but was still reachable otherwise, and not in any other
situation. This also seemed to match up with the excerpts in the DP spec that
I found, so I assumed it was probably correct.

Also, I think using the DDPS field instead of trying to traverse the topology
state (which might not have been fully updated yet in response to CSNs) might
be a slightly better idea since DDPS may end up being updated before the port
has been removed from our in-memory topology, which is kind of one of the
reasons I've been considering trying to come up with a better solution then
topology references someday (unfortunately it works 'good enough' for the most
part, so definitely not a priority). This is 100% a guess on my part though.

> > +                       ret = -EINVAL;
> > +                       goto fail_put;
> > +               }
> >         }
> > 
> >         txmsg = kzalloc(sizeof(*txmsg), GFP_KERNEL); @@ -3384,7 +3395,6
> > @@ int drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr
> > *mgr, int start_s
> >         struct drm_dp_mst_port *port;
> >         int i, j;
> >         int cur_slots = start_slot;
> > -       bool skip;
> > 
> >         mutex_lock(&mgr->payload_lock);
> >         for (i = 0; i < mgr->max_payloads; i++) { @@ -3399,16 +3409,6 @@
> > int
> > drm_dp_update_payload_part1(struct drm_dp_mst_topology_mgr *mgr,
> > int start_s
> >                         port = container_of(vcpi, struct drm_dp_mst_port,
> >                                             vcpi);
> > 
> > -                       mutex_lock(&mgr->lock);
> > -                       skip
> > = !drm_dp_mst_port_downstream_of_branch(port, mgr->mst_primary);
> > -                       mutex_unlock(&mgr->lock);
> > -
> > -                       if (skip) {
> > -                               drm_dbg_kms(mgr->dev,
> > -                                           "Virtual channel %d is not in
> > current
> > topology\n",
> > -                                           i);
> > -                               continue;
> > -                       }
> >                         /* Validated ports don't matter if we're releasing
> >                          * VCPI
> >                          */
> > @@ -3509,7 +3509,6 @@ int drm_dp_update_payload_part2(struct
> > drm_dp_mst_topology_mgr *mgr)
> >         struct drm_dp_mst_port *port;
> >         int i;
> >         int ret = 0;
> > -       bool skip;
> > 
> >         mutex_lock(&mgr->payload_lock);
> >         for (i = 0; i < mgr->max_payloads; i++) { @@ -3519,13 +3518,6 @@
> > int
> > drm_dp_update_payload_part2(struct drm_dp_mst_topology_mgr *mgr)
> > 
> >                 port = container_of(mgr->proposed_vcpis[i], struct
> > drm_dp_mst_port, vcpi);
> > 
> > -               mutex_lock(&mgr->lock);
> > -               skip = !drm_dp_mst_port_downstream_of_branch(port,
> > mgr->mst_primary);
> > -               mutex_unlock(&mgr->lock);
> > -
> > -               if (skip)
> > -                       continue;
> > -
> >                 drm_dbg_kms(mgr->dev, "payload %d %d\n", i, mgr-
> > > payloads[i].payload_state);
> >                 if (mgr->payloads[i].payload_state == DP_PAYLOAD_LOCAL)
> > {
> >                         ret = drm_dp_create_payload_step2(mgr, port, mgr-
> > > proposed_vcpis[i]->vcpi, &mgr->payloads[i]); @@ -4780,18 +4772,9 @@
> > EXPORT_SYMBOL(drm_dp_mst_reset_vcpi_slots);
> >  void drm_dp_mst_deallocate_vcpi(struct drm_dp_mst_topology_mgr *mgr,
> >                                 struct drm_dp_mst_port *port)
> >  {
> > -       bool skip;
> > -
> >         if (!port->vcpi.vcpi)
> >                 return;
> > 
> > -       mutex_lock(&mgr->lock);
> > -       skip = !drm_dp_mst_port_downstream_of_branch(port, mgr-
> > > mst_primary);
> > -       mutex_unlock(&mgr->lock);
> > -
> > -       if (skip)
> > -               return;
> > -
> >         drm_dp_mst_put_payload_id(mgr, port->vcpi.vcpi);
> >         port->vcpi.num_slots = 0;
> >         port->vcpi.pbn = 0;
> > --
> > 2.35.3
> --
> Wayne Lin

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat


  reply	other threads:[~2022-08-02 22:13 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-07 19:29 [RESEND RFC 00/18] drm/display/dp_mst: Drop Radeon MST support, make MST atomic-only Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 01/18] drm/amdgpu/dc/mst: Rename dp_mst_stream_allocation(_table) Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 02/18] drm/amdgpu/dm/mst: Rename get_payload_table() Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 03/18] drm/display/dp_mst: Rename drm_dp_mst_vcpi_allocation Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 04/18] drm/display/dp_mst: Call them time slots, not VCPI slots Lyude Paul
2022-06-15  4:28   ` Lin, Wayne
2022-08-02 21:29     ` Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 05/18] drm/display/dp_mst: Fix confusing docs for drm_dp_atomic_release_time_slots() Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 06/18] drm/display/dp_mst: Add some missing kdocs for atomic MST structs Lyude Paul
2022-06-15  4:43   ` Lin, Wayne
2022-08-08 23:07     ` Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 07/18] drm/display/dp_mst: Add helper for finding payloads in atomic MST state Lyude Paul
2022-06-29 13:22   ` Jani Nikula
2022-06-07 19:29 ` [RESEND RFC 08/18] drm/display/dp_mst: Add nonblocking helpers for DP MST Lyude Paul
2022-06-29 13:30   ` Jani Nikula
2022-06-07 19:29 ` [RESEND RFC 09/18] drm/display/dp_mst: Don't open code modeset checks for releasing time slots Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 10/18] drm/display/dp_mst: Fix modeset tracking in drm_dp_atomic_release_vcpi_slots() Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 11/18] drm/nouveau/kms: Cache DP encoders in nouveau_connector Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 12/18] drm/nouveau/kms: Pull mst state in for all modesets Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 13/18] drm/display/dp_mst: Add helpers for serializing SST <-> MST transitions Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 14/18] drm/display/dp_mst: Drop all ports from topology on CSNs before queueing link address work Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 15/18] drm/display/dp_mst: Skip releasing payloads if last connected port isn't connected Lyude Paul
2022-07-05  8:45   ` Lin, Wayne
2022-08-02 22:12     ` Lyude Paul [this message]
2022-08-10  3:28       ` Lin, Wayne
2022-08-10 22:13         ` Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 16/18] drm/display/dp_mst: Maintain time slot allocations when deleting payloads Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 17/18] drm/radeon: Drop legacy MST support Lyude Paul
2022-06-07 20:48   ` Alex Deucher
2022-06-07 19:29 ` [RESEND RFC 18/18] drm/display/dp_mst: Move all payload info into the atomic state Lyude Paul
2022-07-05  9:10   ` Lin, Wayne
2022-07-06 21:57     ` Lyude Paul
2022-08-03 20:27     ` Lyude Paul
2022-08-08 10:02       ` Lin, Wayne
2022-08-08 22:53         ` Lyude Paul
2022-06-29 13:33 ` [RESEND RFC 00/18] drm/display/dp_mst: Drop Radeon MST support, make MST atomic-only Jani Nikula
2022-07-28 22:21 ` Lyude Paul

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=31d47373883e9aabe5bfa7b172e21b84cc6a164d.camel@redhat.com \
    --to=lyude@redhat.com \
    --cc=Bhawanpreet.Lakha@amd.com \
    --cc=Jerry.Zuo@amd.com \
    --cc=Wayne.Lin@amd.com \
    --cc=airlied@linux.ie \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jani.nikula@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nouveau@lists.freedesktop.org \
    --cc=sean@poorly.run \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).