public inbox for linux-edac@vger.kernel.org
 help / color / mirror / Atom feed
From: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
To: Borislav Petkov <bp@alien8.de>
Cc: ssengar@linux.microsoft.com, shubhrajyoti.datta@amd.com,
	tony.luck@intel.com, linux-edac@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove()
Date: Mon, 6 Apr 2026 10:56:17 +0530	[thread overview]
Message-ID: <b6af31b0-0429-4092-8ec0-e0ab84a16f39@linux.microsoft.com> (raw)
In-Reply-To: <20260403103457.GAac-X0SzsO9MtwAVE@fat_crate.local>



On 03-04-2026 16:04, Borislav Petkov wrote:
> On Wed, Apr 01, 2026 at 04:18:36AM -0700, Prasanna Kumar T S M wrote:
>> The teardown sequence in mc_remove() does not mirror the reverse of the
>> initialization order in mc_probe(). In particular,
>> unregister_rpmsg_driver() is called before remove_versalnet(), and
>> cdx_mcdi_finish() is called after rproc_shutdown().
>>
>> Reorder mc_remove() to reverse the probe initialization sequence,
>> consistent with the probe error-unwind paths.
>>
>> The rproc reference acquired via rproc_get_by_phandle() during probe
>> is not released in mc_remove(), causing a reference count leak. Add
>> the missing rproc_put() call.
>>
>> Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
>> Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
>> ---
>>   drivers/edac/versalnet_edac.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> Sashiko has found things, pls addres them:
> 
> https://sashiko.dev/#/patchset/20260401111836.2342918-1-ptsm%40linux.microsoft.com
> 

I asked AI to validate Sashiko's comment. This is its output
------------------------------------------------------------

   Analysis

   The review comment is not valid (false positive). Here's the detailed 
reasoning:

   Data structure hierarchy

    mc_priv
      └── mcdi: struct cdx_mcdi *            ← kfree'd at step 6 (LAST)
            ├── mcdi: struct cdx_mcdi_data *  ← kfree'd by 
cdx_mcdi_finish() at step 2
            │     └── iface: struct cdx_mcdi_iface (mutex, cmd list, etc.)
            ├── r5_rproc
            └── ...

   New mc_remove() order (after commit)

    (1) remove_versalnet(priv);                  // tear down EDAC
    (2) cdx_mcdi_finish(priv->mcdi);             // wait_for_cleanup → 
kfree(cdx->mcdi) → cdx->mcdi = NULL
    (3) unregister_rpmsg_driver(&amd_rpmsg_driver); // destroys rpmsg 
endpoint
    (4) rproc_shutdown(priv->mcdi->r5_rproc);
    (5) rproc_put(priv->mcdi->r5_rproc);
    (6) kfree(priv->mcdi);                       // frees outer struct

   Why the concern doesn't hold

   The reviewer's premise is wrong. The comment says "after priv->mcdi 
is freed" — but priv->mcdi (the struct cdx_mcdi * pointer passed to 
cdx_mcdi_process_cmd) is not freed
   until step 6, well after unregister_rpmsg_driver() at step 3.

   What IS freed at step 2 is the inner cdx->mcdi (struct cdx_mcdi_data 
*). But cdx_mcdi_finish() also sets cdx->mcdi = NULL. If rpmsg_cb() 
fires between steps 2 and 3:

    1. mc_priv->mcdi → still points to valid struct cdx_mcdi (not freed 
until step 6)
    2. cdx_mcdi_process_cmd(mc_priv->mcdi, ...) is called with a valid 
pointer
    3. Inside, cdx_mcdi_if(cdx) checks cdx->mcdi → NULL → returns NULL
    4. if (!mcdi) return; → exits early, safe

   Additionally, cdx_mcdi_wait_for_cleanup() inside cdx_mcdi_finish() 
ensures all pending MCDI commands have completed before freeing, so no 
legitimate MCDI responses should
   arrive afterward.

   This ordering exactly matches the probe error unwind path (err_init: 
does cdx_mcdi_finish → kfree → unregister_rpmsg_driver → rproc_shutdown 
→ rproc_put), which the
   commit explicitly states it's aligning with.

   Verdict: False positive. The NULL guard in cdx_mcdi_if() / 
cdx_mcdi_process_cmd() protects the window, and the outer struct 
priv->mcdi remains valid throughout.

------------------------------------------------------------

So the comment looks like a false positive. It will be great if someone 
from AMD verifies this.

Can you please look at the other 2 patches in this series? They are 
independent of this change.

Thanks,
Prasanna Kumar

  reply	other threads:[~2026-04-06  5:26 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-01 11:18 [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Prasanna Kumar T S M
2026-04-01 11:18 ` [PATCH v2 2/3] EDAC/versalnet: Fix device name memory leak Prasanna Kumar T S M
2026-04-01 11:19 ` [PATCH v2 3/3] EDAC/versalnet: Fix device_register() error handling in init_one_mc() Prasanna Kumar T S M
2026-04-03 10:34 ` [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Borislav Petkov
2026-04-06  5:26   ` Prasanna Kumar T S M [this message]
2026-04-06  8:23     ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b6af31b0-0429-4092-8ec0-e0ab84a16f39@linux.microsoft.com \
    --to=ptsm@linux.microsoft.com \
    --cc=bp@alien8.de \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=shubhrajyoti.datta@amd.com \
    --cc=ssengar@linux.microsoft.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox