dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Lyude Paul <lyude@redhat.com>
To: "Lin, Wayne" <Wayne.Lin@amd.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>,
	"nouveau@lists.freedesktop.org" <nouveau@lists.freedesktop.org>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>,
	Jani Nikula <jani.nikula@intel.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	open list <linux-kernel@vger.kernel.org>,
	"Lakha,  Bhawanpreet" <Bhawanpreet.Lakha@amd.com>,
	David Airlie <airlied@linux.ie>, "Zuo, Jerry" <Jerry.Zuo@amd.com>,
	Sean Paul <sean@poorly.run>
Subject: Re: [RESEND RFC 15/18] drm/display/dp_mst: Skip releasing payloads if last connected port isn't connected
Date: Wed, 10 Aug 2022 18:13:18 -0400	[thread overview]
Message-ID: <ef681a8e89c2f4740141d66dd4a3fcb0ad71ab37.camel@redhat.com> (raw)
In-Reply-To: <SJ0PR12MB5504B1917F4933F1C696FCACFC659@SJ0PR12MB5504.namprd12.prod.outlook.com>

On Wed, 2022-08-10 at 03:28 +0000, Lin, Wayne wrote:
> Hi Lyude,
> Thanks for your time and sorry for late response!
> 
> It's described in 5.6.1.3 of DP spec 2.0: 
> "MST branch device, in addition to waiting for the ACK from its immediate 
> Upstream device, should either wait for the ALLOCATE_PAYLOAD message
> transaction with a PBN value equal to 0 from the MST Source device for 
> de-allocating the time slot assigned to the VC Payload that is routed to the
> unplugged DFP or for 2 seconds, whichever occurs first."

oooh! Thank you for posting this, I totally missed the bit that says "or for 2
seconds, whichever occurs first." That certainly explains a lot.

> 
> > > commit 3769e4c0af5b ("drm/dp_mst: Avoid to mess up payload table by
> > > ports in stale topology") was trying to skip updating payload for a
> > > target which is no longer existing in the current topology rooted at
> > > mgr->mst_primary. I passed "mgr->mst_primary" to
> > > drm_dp_mst_port_downstream_of_branch() previously.
> > > Sorry, I might not fully understand the issue you've seen. Could you
> > > elaborate on this more please?
> > > 
> > > Thanks!
> > 
> > I will have to double check this since it's been a month, but basically - the idea
> > of having the topology references in the first place was to be the one check
> > for figuring out whether something's in a topology or not. I've been thinking
> > of maybe trying to replace it at some point, but I think we'd want to do it all
> > over the helpers instead of just in certain spots.
> > 
> > The other thing I noticed was that when I was rewriting this code, I noticed it
> > seemed a lot like we had misunderstood the issue that was causing leaks in
> > the first place. The BAD_PARAM we noticed indicates the payload we're
> > trying to remove on the other end doesn't exist anymore, meaning the
> > branch device in question got rid of any payloads it had active in response to
> > the CSN. In testing though I found that payloads would be automatically
> > released in situations where the last reachable port was marked as
> > disconnected via a previous CSN, but was still reachable otherwise, and not in
> > any other situation. This also seemed to match up with the excerpts in the DP
> > spec that I found, so I assumed it was probably correct.
> 
> IMHO, the main root cause with the commit 3769e4c0af5b ("drm/dp_mst: Avoid
>  to mess up payload table by ports in stale topology") is like what described in the
> commit message. The problem I encountered was when I unplugged the primary
> mst branch device from the system, upper layer didn't try to  release stale streams
> immediately. Instead, it started to gradually release stale streams when I plugged the
> mst hub back to the system. In that case, if we didn't do the check to see whether
> the current request for deallocating payload is for this time topology instance, 
> i.e. might be for the stale topology before I unplug, this deallocation will mess up
> payload allocation for new topology instance.
> 
> As for the CSN, it's a node broadcast request message and not a path message.
> Referring to 2.14.6.1 of DP 2.0 spec: 
> "If the broadcast message is a node request, only the end devices, DP MST
> Source or Sink devices (or DP MST Branch device if Source/Sink are not plugged),
> process the request."
> IMHO, payload should be controlled by source only, by ALLOCATE_PAYLOAD or
> CLEAR_PAYLAOD_ID_TABLE message.
> 
> > 
> > Also, I think using the DDPS field instead of trying to traverse the topology
> > state (which might not have been fully updated yet in response to CSNs)
> > might be a slightly better idea since DDPS may end up being updated before
> > the port has been removed from our in-memory topology, which is kind of
> 
> Thank you Lyude! Just want to confirm with you the below idea to see if I
> understand it correctly. 
> The flow I thought would be (from Source perspective):
> Receive CSN for notifying disconnection event => update physical topology
> connection status (e.g. DDPS, put topology krefcount..) => send hotplug event to
> userspace => userspace asks deallocating payloads for disconnected stream
> sinks =>  put malloc krefcount of disconnected ports/mstbs  => remove ports/mstb
> from in-memory topology.
> I suppose physical topology connection status is updated before sending hotplug
> event to userspace and the in-memory topology still can be referred for stale 
> connection status before payload deallocation completes, i.e. which will put 
> malloc krefcount to eventually destroy disconnected devices in topology in-memory.
> I mean, ideally, sounds like the topology in-memory should be reliable when
> we send ALLOCATE_PAYLOAD as PBN=0. But I understand it definitely is not the
> case if we have krefcount leak.

mhm, I think you made me realize I'm overthinking this a bit now that I've
seen the excerpt you mentioned above, along with the other excerpt about only
the end devices being involved. The main reason I originally foresaw an issue
with this is because the delay with updating the in-memory topology structure
might put us slightly out of sync with the state of the hub on the other end -
causing the hub to spit out an error.

However - based on the excerpts you mentioned I think what I was seeing was
mainly just the 2 second timeout causing things to be released properly - not
specific behavior based on the location in the topology of the branch that was
just unplugged like I originally assumed. I think in that case it probably
does make more sense to go with your fix, so I'll likely drop this and rework
the topology checks you had into this.

> 
> Appreciate for your time and help Lyude!
> 

no, thank you for your help! :) There aren't a whole ton of people who are
this involved with MST so it's very useful to finally have another pair of
eyes looking at all of this. 

>   

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat


  reply	other threads:[~2022-08-10 22:13 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-07 19:29 [RESEND RFC 00/18] drm/display/dp_mst: Drop Radeon MST support, make MST atomic-only Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 01/18] drm/amdgpu/dc/mst: Rename dp_mst_stream_allocation(_table) Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 02/18] drm/amdgpu/dm/mst: Rename get_payload_table() Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 03/18] drm/display/dp_mst: Rename drm_dp_mst_vcpi_allocation Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 04/18] drm/display/dp_mst: Call them time slots, not VCPI slots Lyude Paul
2022-06-15  4:28   ` Lin, Wayne
2022-08-02 21:29     ` Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 05/18] drm/display/dp_mst: Fix confusing docs for drm_dp_atomic_release_time_slots() Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 06/18] drm/display/dp_mst: Add some missing kdocs for atomic MST structs Lyude Paul
2022-06-15  4:43   ` Lin, Wayne
2022-08-08 23:07     ` Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 07/18] drm/display/dp_mst: Add helper for finding payloads in atomic MST state Lyude Paul
2022-06-29 13:22   ` Jani Nikula
2022-06-07 19:29 ` [RESEND RFC 08/18] drm/display/dp_mst: Add nonblocking helpers for DP MST Lyude Paul
2022-06-29 13:30   ` Jani Nikula
2022-06-07 19:29 ` [RESEND RFC 09/18] drm/display/dp_mst: Don't open code modeset checks for releasing time slots Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 10/18] drm/display/dp_mst: Fix modeset tracking in drm_dp_atomic_release_vcpi_slots() Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 11/18] drm/nouveau/kms: Cache DP encoders in nouveau_connector Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 12/18] drm/nouveau/kms: Pull mst state in for all modesets Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 13/18] drm/display/dp_mst: Add helpers for serializing SST <-> MST transitions Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 14/18] drm/display/dp_mst: Drop all ports from topology on CSNs before queueing link address work Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 15/18] drm/display/dp_mst: Skip releasing payloads if last connected port isn't connected Lyude Paul
2022-07-05  8:45   ` Lin, Wayne
2022-08-02 22:12     ` Lyude Paul
2022-08-10  3:28       ` Lin, Wayne
2022-08-10 22:13         ` Lyude Paul [this message]
2022-06-07 19:29 ` [RESEND RFC 16/18] drm/display/dp_mst: Maintain time slot allocations when deleting payloads Lyude Paul
2022-06-07 19:29 ` [RESEND RFC 17/18] drm/radeon: Drop legacy MST support Lyude Paul
2022-06-07 20:48   ` Alex Deucher
2022-06-07 19:29 ` [RESEND RFC 18/18] drm/display/dp_mst: Move all payload info into the atomic state Lyude Paul
2022-07-05  9:10   ` Lin, Wayne
2022-07-06 21:57     ` Lyude Paul
2022-08-03 20:27     ` Lyude Paul
2022-08-08 10:02       ` Lin, Wayne
2022-08-08 22:53         ` Lyude Paul
2022-06-29 13:33 ` [RESEND RFC 00/18] drm/display/dp_mst: Drop Radeon MST support, make MST atomic-only Jani Nikula
2022-07-28 22:21 ` Lyude Paul

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ef681a8e89c2f4740141d66dd4a3fcb0ad71ab37.camel@redhat.com \
    --to=lyude@redhat.com \
    --cc=Bhawanpreet.Lakha@amd.com \
    --cc=Jerry.Zuo@amd.com \
    --cc=Wayne.Lin@amd.com \
    --cc=airlied@linux.ie \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=daniel.vetter@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jani.nikula@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nouveau@lists.freedesktop.org \
    --cc=sean@poorly.run \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).