* [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove()
@ 2026-04-01 11:18 Prasanna Kumar T S M
2026-04-01 11:18 ` [PATCH v2 2/3] EDAC/versalnet: Fix device name memory leak Prasanna Kumar T S M
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Prasanna Kumar T S M @ 2026-04-01 11:18 UTC (permalink / raw)
To: ptsm, ssengar, shubhrajyoti.datta, bp, tony.luck, linux-edac,
linux-kernel
The teardown sequence in mc_remove() does not mirror the reverse of the
initialization order in mc_probe(). In particular,
unregister_rpmsg_driver() is called before remove_versalnet(), and
cdx_mcdi_finish() is called after rproc_shutdown().
Reorder mc_remove() to reverse the probe initialization sequence,
consistent with the probe error-unwind paths.
The rproc reference acquired via rproc_get_by_phandle() during probe
is not released in mc_remove(), causing a reference count leak. Add
the missing rproc_put() call.
Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
---
drivers/edac/versalnet_edac.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c
index b87fe57aa842..acd51b492772 100644
--- a/drivers/edac/versalnet_edac.c
+++ b/drivers/edac/versalnet_edac.c
@@ -955,10 +955,11 @@ static void mc_remove(struct platform_device *pdev)
{
struct mc_priv *priv = platform_get_drvdata(pdev);
- unregister_rpmsg_driver(&amd_rpmsg_driver);
remove_versalnet(priv);
- rproc_shutdown(priv->mcdi->r5_rproc);
cdx_mcdi_finish(priv->mcdi);
+ unregister_rpmsg_driver(&amd_rpmsg_driver);
+ rproc_shutdown(priv->mcdi->r5_rproc);
+ rproc_put(priv->mcdi->r5_rproc);
kfree(priv->mcdi);
}
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 2/3] EDAC/versalnet: Fix device name memory leak
2026-04-01 11:18 [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Prasanna Kumar T S M
@ 2026-04-01 11:18 ` Prasanna Kumar T S M
2026-04-01 11:19 ` [PATCH v2 3/3] EDAC/versalnet: Fix device_register() error handling in init_one_mc() Prasanna Kumar T S M
2026-04-03 10:34 ` [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Borislav Petkov
2 siblings, 0 replies; 6+ messages in thread
From: Prasanna Kumar T S M @ 2026-04-01 11:18 UTC (permalink / raw)
To: ptsm, ssengar, shubhrajyoti.datta, bp, tony.luck, linux-edac,
linux-kernel
The device name allocated via kzalloc() in init_one_mc() is assigned to
dev->init_name but never freed on the normal removal path.
device_register() copies init_name and then sets dev->init_name to NULL,
so the name pointer becomes unreachable from the device. Thus leaking
memory.
Use a stack-local char array instead of using kzalloc() for name.
Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Cc: stable@vger.kernel.org
Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
---
drivers/edac/versalnet_edac.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c
index acd51b492772..012c4f40994d 100644
--- a/drivers/edac/versalnet_edac.c
+++ b/drivers/edac/versalnet_edac.c
@@ -779,7 +779,7 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i
struct mem_ctl_info *mci;
struct device *dev;
enum dev_type dt;
- char *name;
+ char name[MC_NAME_LEN];
int rc;
config = priv->adec[CONF + i * ADEC_NUM];
@@ -813,13 +813,9 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i
layers[1].is_virt_csrow = false;
rc = -ENOMEM;
- name = kzalloc(MC_NAME_LEN, GFP_KERNEL);
- if (!name)
- return rc;
-
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
if (!dev)
- goto err_name_free;
+ return rc;
mci = edac_mc_alloc(i, ARRAY_SIZE(layers), layers, sizeof(struct mc_priv));
if (!mci) {
@@ -858,8 +854,6 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i
edac_mc_free(mci);
err_dev_free:
kfree(dev);
-err_name_free:
- kfree(name);
return rc;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH v2 3/3] EDAC/versalnet: Fix device_register() error handling in init_one_mc()
2026-04-01 11:18 [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Prasanna Kumar T S M
2026-04-01 11:18 ` [PATCH v2 2/3] EDAC/versalnet: Fix device name memory leak Prasanna Kumar T S M
@ 2026-04-01 11:19 ` Prasanna Kumar T S M
2026-04-03 10:34 ` [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Borislav Petkov
2 siblings, 0 replies; 6+ messages in thread
From: Prasanna Kumar T S M @ 2026-04-01 11:19 UTC (permalink / raw)
To: ptsm, ssengar, shubhrajyoti.datta, bp, tony.luck, linux-edac,
linux-kernel
When device_register() fails, it must be followed by put_device()
rather than kfree(), because device_register() calls
device_initialize() which sets up the device refcount. The matching
release function versal_edac_release() handles the actual kfree().
To simplify error handling and avoid complex unwinding, split
device_register() into device_initialize() and device_add().
Initialize the device early so put_device() can be used in all
error paths.
Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Cc: stable@vger.kernel.org
Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
drivers/edac/versalnet_edac.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c
index 012c4f40994d..94580d3c3170 100644
--- a/drivers/edac/versalnet_edac.c
+++ b/drivers/edac/versalnet_edac.c
@@ -780,7 +780,7 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i
struct device *dev;
enum dev_type dt;
char name[MC_NAME_LEN];
- int rc;
+ int rc = -ENOMEM;
config = priv->adec[CONF + i * ADEC_NUM];
num_chans = FIELD_GET(MC5_NUM_CHANS_MASK, config);
@@ -812,23 +812,23 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i
layers[1].size = num_chans;
layers[1].is_virt_csrow = false;
- rc = -ENOMEM;
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
if (!dev)
return rc;
+ sprintf(name, "versal-net-ddrmc5-edac-%d", i);
+
+ dev->init_name = name;
+ dev->release = versal_edac_release;
+ device_initialize(dev);
+
mci = edac_mc_alloc(i, ARRAY_SIZE(layers), layers, sizeof(struct mc_priv));
if (!mci) {
edac_printk(KERN_ERR, EDAC_MC, "Failed memory allocation for MC%d\n", i);
goto err_dev_free;
}
- sprintf(name, "versal-net-ddrmc5-edac-%d", i);
-
- dev->init_name = name;
- dev->release = versal_edac_release;
-
- rc = device_register(dev);
+ rc = device_add(dev);
if (rc)
goto err_mc_free;
@@ -849,11 +849,11 @@ static int init_one_mc(struct mc_priv *priv, struct platform_device *pdev, int i
return 0;
err_unreg:
- device_unregister(mci->pdev);
+ device_del(dev);
err_mc_free:
edac_mc_free(mci);
err_dev_free:
- kfree(dev);
+ put_device(dev);
return rc;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove()
2026-04-01 11:18 [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Prasanna Kumar T S M
2026-04-01 11:18 ` [PATCH v2 2/3] EDAC/versalnet: Fix device name memory leak Prasanna Kumar T S M
2026-04-01 11:19 ` [PATCH v2 3/3] EDAC/versalnet: Fix device_register() error handling in init_one_mc() Prasanna Kumar T S M
@ 2026-04-03 10:34 ` Borislav Petkov
2026-04-06 5:26 ` Prasanna Kumar T S M
2 siblings, 1 reply; 6+ messages in thread
From: Borislav Petkov @ 2026-04-03 10:34 UTC (permalink / raw)
To: Prasanna Kumar T S M
Cc: ssengar, shubhrajyoti.datta, tony.luck, linux-edac, linux-kernel
On Wed, Apr 01, 2026 at 04:18:36AM -0700, Prasanna Kumar T S M wrote:
> The teardown sequence in mc_remove() does not mirror the reverse of the
> initialization order in mc_probe(). In particular,
> unregister_rpmsg_driver() is called before remove_versalnet(), and
> cdx_mcdi_finish() is called after rproc_shutdown().
>
> Reorder mc_remove() to reverse the probe initialization sequence,
> consistent with the probe error-unwind paths.
>
> The rproc reference acquired via rproc_get_by_phandle() during probe
> is not released in mc_remove(), causing a reference count leak. Add
> the missing rproc_put() call.
>
> Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
> Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
> ---
> drivers/edac/versalnet_edac.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
Sashiko has found things, pls addres them:
https://sashiko.dev/#/patchset/20260401111836.2342918-1-ptsm%40linux.microsoft.com
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove()
2026-04-03 10:34 ` [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Borislav Petkov
@ 2026-04-06 5:26 ` Prasanna Kumar T S M
2026-04-06 8:23 ` Borislav Petkov
0 siblings, 1 reply; 6+ messages in thread
From: Prasanna Kumar T S M @ 2026-04-06 5:26 UTC (permalink / raw)
To: Borislav Petkov
Cc: ssengar, shubhrajyoti.datta, tony.luck, linux-edac, linux-kernel
On 03-04-2026 16:04, Borislav Petkov wrote:
> On Wed, Apr 01, 2026 at 04:18:36AM -0700, Prasanna Kumar T S M wrote:
>> The teardown sequence in mc_remove() does not mirror the reverse of the
>> initialization order in mc_probe(). In particular,
>> unregister_rpmsg_driver() is called before remove_versalnet(), and
>> cdx_mcdi_finish() is called after rproc_shutdown().
>>
>> Reorder mc_remove() to reverse the probe initialization sequence,
>> consistent with the probe error-unwind paths.
>>
>> The rproc reference acquired via rproc_get_by_phandle() during probe
>> is not released in mc_remove(), causing a reference count leak. Add
>> the missing rproc_put() call.
>>
>> Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
>> Signed-off-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
>> ---
>> drivers/edac/versalnet_edac.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> Sashiko has found things, pls addres them:
>
> https://sashiko.dev/#/patchset/20260401111836.2342918-1-ptsm%40linux.microsoft.com
>
I asked AI to validate Sashiko's comment. This is its output
------------------------------------------------------------
Analysis
The review comment is not valid (false positive). Here's the detailed
reasoning:
Data structure hierarchy
mc_priv
└── mcdi: struct cdx_mcdi * ← kfree'd at step 6 (LAST)
├── mcdi: struct cdx_mcdi_data * ← kfree'd by
cdx_mcdi_finish() at step 2
│ └── iface: struct cdx_mcdi_iface (mutex, cmd list, etc.)
├── r5_rproc
└── ...
New mc_remove() order (after commit)
(1) remove_versalnet(priv); // tear down EDAC
(2) cdx_mcdi_finish(priv->mcdi); // wait_for_cleanup →
kfree(cdx->mcdi) → cdx->mcdi = NULL
(3) unregister_rpmsg_driver(&amd_rpmsg_driver); // destroys rpmsg
endpoint
(4) rproc_shutdown(priv->mcdi->r5_rproc);
(5) rproc_put(priv->mcdi->r5_rproc);
(6) kfree(priv->mcdi); // frees outer struct
Why the concern doesn't hold
The reviewer's premise is wrong. The comment says "after priv->mcdi
is freed" — but priv->mcdi (the struct cdx_mcdi * pointer passed to
cdx_mcdi_process_cmd) is not freed
until step 6, well after unregister_rpmsg_driver() at step 3.
What IS freed at step 2 is the inner cdx->mcdi (struct cdx_mcdi_data
*). But cdx_mcdi_finish() also sets cdx->mcdi = NULL. If rpmsg_cb()
fires between steps 2 and 3:
1. mc_priv->mcdi → still points to valid struct cdx_mcdi (not freed
until step 6)
2. cdx_mcdi_process_cmd(mc_priv->mcdi, ...) is called with a valid
pointer
3. Inside, cdx_mcdi_if(cdx) checks cdx->mcdi → NULL → returns NULL
4. if (!mcdi) return; → exits early, safe
Additionally, cdx_mcdi_wait_for_cleanup() inside cdx_mcdi_finish()
ensures all pending MCDI commands have completed before freeing, so no
legitimate MCDI responses should
arrive afterward.
This ordering exactly matches the probe error unwind path (err_init:
does cdx_mcdi_finish → kfree → unregister_rpmsg_driver → rproc_shutdown
→ rproc_put), which the
commit explicitly states it's aligning with.
Verdict: False positive. The NULL guard in cdx_mcdi_if() /
cdx_mcdi_process_cmd() protects the window, and the outer struct
priv->mcdi remains valid throughout.
------------------------------------------------------------
So the comment looks like a false positive. It will be great if someone
from AMD verifies this.
Can you please look at the other 2 patches in this series? They are
independent of this change.
Thanks,
Prasanna Kumar
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove()
2026-04-06 5:26 ` Prasanna Kumar T S M
@ 2026-04-06 8:23 ` Borislav Petkov
0 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2026-04-06 8:23 UTC (permalink / raw)
To: Prasanna Kumar T S M
Cc: ssengar, shubhrajyoti.datta, tony.luck, linux-edac, linux-kernel
On Mon, Apr 06, 2026 at 10:56:17AM +0530, Prasanna Kumar T S M wrote:
> > Sashiko has found things, pls addres them:
> >
> > https://sashiko.dev/#/patchset/20260401111836.2342918-1-ptsm%40linux.microsoft.com
> >
>
> I asked AI to validate Sashiko's comment.
So I asked *you* to address the Sashiko review. You go and ask another AI to
validate the review.
Now, if I go and paste the whole conversation to a third AI, the f*ckup is
complete.
Tell me: is that the goal of this exercise? Let AIs do the thinking
for us and we can all go shopping?
Or maybe *you* should go review Sashiko's review and say, I agree because
<proper, comprehensible explanation> or I don't agree because <also proper,
comprehensible explanation>.
Then *I* go and verify that.
Hmm...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-04-06 8:24 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-01 11:18 [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Prasanna Kumar T S M
2026-04-01 11:18 ` [PATCH v2 2/3] EDAC/versalnet: Fix device name memory leak Prasanna Kumar T S M
2026-04-01 11:19 ` [PATCH v2 3/3] EDAC/versalnet: Fix device_register() error handling in init_one_mc() Prasanna Kumar T S M
2026-04-03 10:34 ` [PATCH v2 1/3] EDAC/versalnet: Fix teardown ordering in mc_remove() Borislav Petkov
2026-04-06 5:26 ` Prasanna Kumar T S M
2026-04-06 8:23 ` Borislav Petkov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox