From: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
To: Borislav Petkov <bp@alien8.de>
Cc: ssengar@linux.microsoft.com, shubhrajyoti.datta@amd.com,
tony.luck@intel.com, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove()
Date: Mon, 6 Apr 2026 10:56:17 +0530
Message-ID: <b6af31b0-0429-4092-8ec0-e0ab84a16f39@linux.microsoft.com>
In-Reply-To: <20260403103457.GAac-X0SzsO9MtwAVE@fat_crate.local>
On 03-04-2026 16:04, Borislav Petkov wrote:
> On Wed, Apr 01, 2026 at 04:18:36AM -0700, Prasanna Kumar T S M wrote:
>> The teardown sequence in mc_remove() does not mirror the reverse of the
>> initialization order in mc_probe(). In particular,
>> unregister_rpmsg_driver() is called before remove_versalnet(), and
>> cdx_mcdi_finish() is called after rproc_shutdown().
>>
>> Reorder mc_remove() to reverse the probe initialization sequence,
>> consistent with the probe error-unwind paths.
>>
>> The rproc reference acquired via rproc_get_by_phandle() during probe
>> is not released in mc_remove(), causing a reference count leak. Add
>> the missing rproc_put() call.
>>
>> Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
>> Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
>> ---
>> drivers/edac/versalnet_edac.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> Sashiko has found things, pls address them:
>
> https://sashiko.dev/#/patchset/20260401111836.2342918-1-ptsm%40linux.microsoft.com
>
I asked an AI to validate Sashiko's comment. This is its output:
------------------------------------------------------------
Analysis

The review comment is not valid (false positive). Here's the detailed
reasoning:

Data structure hierarchy

  mc_priv
  └── mcdi: struct cdx_mcdi *            ← kfree'd at step 6 (LAST)
      ├── mcdi: struct cdx_mcdi_data *   ← kfree'd by cdx_mcdi_finish() at step 2
      │   └── iface: struct cdx_mcdi_iface (mutex, cmd list, etc.)
      ├── r5_rproc
      └── ...

New mc_remove() order (after commit)

  (1) remove_versalnet(priv);                      // tear down EDAC
  (2) cdx_mcdi_finish(priv->mcdi);                 // wait_for_cleanup → kfree(cdx->mcdi) → cdx->mcdi = NULL
  (3) unregister_rpmsg_driver(&amd_rpmsg_driver);  // destroys rpmsg endpoint
  (4) rproc_shutdown(priv->mcdi->r5_rproc);
  (5) rproc_put(priv->mcdi->r5_rproc);
  (6) kfree(priv->mcdi);                           // frees outer struct

Why the concern doesn't hold

The reviewer's premise is wrong. The comment says "after priv->mcdi is
freed", but priv->mcdi (the struct cdx_mcdi * pointer passed to
cdx_mcdi_process_cmd) is not freed until step 6, well after
unregister_rpmsg_driver() at step 3.

What IS freed at step 2 is the inner cdx->mcdi (struct cdx_mcdi_data *).
But cdx_mcdi_finish() also sets cdx->mcdi = NULL. If rpmsg_cb() fires
between steps 2 and 3:

  1. mc_priv->mcdi → still points to a valid struct cdx_mcdi (not freed
     until step 6)
  2. cdx_mcdi_process_cmd(mc_priv->mcdi, ...) is called with a valid
     pointer
  3. Inside, cdx_mcdi_if(cdx) checks cdx->mcdi → NULL → returns NULL
  4. if (!mcdi) return; → exits early, safe

Additionally, cdx_mcdi_wait_for_cleanup() inside cdx_mcdi_finish()
ensures all pending MCDI commands have completed before freeing, so no
legitimate MCDI responses should arrive afterward.

This ordering exactly matches the probe error unwind path (err_init:
does cdx_mcdi_finish → kfree → unregister_rpmsg_driver → rproc_shutdown
→ rproc_put), which the commit explicitly states it's aligning with.

Verdict: False positive. The NULL guard in cdx_mcdi_if() /
cdx_mcdi_process_cmd() protects the window, and the outer struct
priv->mcdi remains valid throughout.
------------------------------------------------------------
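
For anyone following along, the window between steps 2 and 3 can be
modelled in a few lines of userspace C. This is only a sketch under
stated assumptions: the struct layouts and the model_* names are
hypothetical stand-ins, not the driver's code, and the model is
single-threaded.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures named in the thread. */
struct mcdi_data {                 /* models struct cdx_mcdi_data */
	int handled;
};

struct mcdi_outer {                /* models struct cdx_mcdi */
	struct mcdi_data *mcdi;    /* inner pointer, freed at step 2 */
};

/* Models cdx_mcdi_if(): hand back the inner data, or NULL if it is gone. */
static struct mcdi_data *model_if(struct mcdi_outer *cdx)
{
	return cdx->mcdi;
}

/* Models cdx_mcdi_finish(): free the inner data and clear the pointer. */
static void model_finish(struct mcdi_outer *cdx)
{
	free(cdx->mcdi);
	cdx->mcdi = NULL;
}

/* Models the callback path: returns 1 if handled, 0 if it bailed out. */
static int model_process_cmd(struct mcdi_outer *cdx)
{
	struct mcdi_data *mcdi = model_if(cdx);

	if (!mcdi)
		return 0;          /* NULL guard: inner data already freed */
	mcdi->handled++;
	return 1;
}

/* Runs "callback, finish, late callback"; returns 0 if the guard held. */
int model_late_callback(void)
{
	struct mcdi_outer *outer = calloc(1, sizeof(*outer));
	int before, after;

	if (!outer)
		return -1;
	outer->mcdi = calloc(1, sizeof(*outer->mcdi));
	if (!outer->mcdi) {
		free(outer);
		return -1;
	}

	before = model_process_cmd(outer);   /* inner data present */
	model_finish(outer);                 /* step 2 */
	assert(outer->mcdi == NULL);
	after = model_process_cmd(outer);    /* late callback, before step 3 */

	free(outer);                         /* step 6: outer struct last */
	return (before == 1 && after == 0) ? 0 : 1;
}
```

The model only covers a callback that starts after the inner pointer
has been cleared. It says nothing about a callback that has already
loaded the inner pointer while cdx_mcdi_finish() runs on another CPU;
that case depends on the real locking in the driver.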
So the comment looks like a false positive. It would be great if someone
from AMD could verify this.
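
On the ordering point itself, the property the patch aims for (remove
runs the probe steps in reverse, the same way the probe error path
does) can be sketched as below. The stage names are illustrative labels
inferred from the mc_remove() order listed above, not taken from the
driver, and the model_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative stages, in assumed probe order. */
static const char *const stages[] = { "rproc", "rpmsg", "mcdi", "edac" };
#define NSTAGES ((int)(sizeof(stages) / sizeof(stages[0])))

/* Append "<sign><name> " to the log, never writing past len bytes. */
static void append(char *log, size_t len, char sign, const char *name)
{
	size_t used;

	assert(log != NULL);
	used = strlen(log);
	if (used < len)
		snprintf(log + used, len - used, "%c%s ", sign, name);
}

/* Undo stages [0, done) in reverse order. */
static void unwind(int done, char *log, size_t len)
{
	while (done-- > 0)
		append(log, len, '-', stages[done]);
}

/*
 * Bring stages up in order. If stage fail_at fails, undo what was
 * already set up and return -1. Pass fail_at = -1 for a clean probe.
 */
int model_probe(int fail_at, char *log, size_t len)
{
	int i;

	for (i = 0; i < NSTAGES; i++) {
		if (i == fail_at) {
			unwind(i, log, len);
			return -1;
		}
		append(log, len, '+', stages[i]);
	}
	return 0;
}

/* Remove is the full unwind, so it cannot drift from the error path. */
void model_remove(char *log, size_t len)
{
	unwind(NSTAGES, log, len);
}
```

Sharing one unwind helper between the error path and remove is a design
choice of this sketch; the driver itself uses explicit calls and labels.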
Can you please look at the other 2 patches in this series? They are
independent of this change.
Thanks,
Prasanna Kumar