From: Jakub Kicinski <kuba@kernel.org>
To: nikhil.rao@amd.com
Cc: Jakub Kicinski <kuba@kernel.org>,
brett.creeley@amd.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
eric.joyner@amd.com, Vamsi.Atluri@amd.com
Subject: Re: [PATCH net-next v2 3/6] pds_core: add PLDM firmware update support via devlink flash
Date: Wed, 20 May 2026 16:44:32 -0700 [thread overview]
Message-ID: <20260520234432.567056-1-kuba@kernel.org> (raw)
In-Reply-To: <20260516-upstream_v2_clean-v2-3-7e0d66bf4020@amd.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
pds_core: add PLDM firmware update support via devlink flash
This adds PLDM firmware update support to pds_core via the upstream
pldmfw API, allowing flashing of entire packages or individual
components through devlink dev flash.
> diff --git a/drivers/net/ethernet/amd/pds_core/fw.c b/drivers/net/ethernet/amd/pds_core/fw.c
> index fa626719e68d..f091a753bce9 100644
> --- a/drivers/net/ethernet/amd/pds_core/fw.c
> +++ b/drivers/net/ethernet/amd/pds_core/fw.c
[ ... ]
> +int pdsc_get_component_info(struct pdsc *pdsc)
> +{
> + union pds_core_dev_cmd cmd = {
> + .get_component_info.opcode = PDS_CORE_CMD_GET_COMPONENT_INFO,
> + .get_component_info.ver = 1,
> + };
> + struct pds_core_component_list_info *list_info;
> + union pds_core_dev_comp comp = {};
> + dma_addr_t dma_addr;
> + u8 num_components;
> + int err, i;
> +
> + list_info = kzalloc(PDS_PAGE_SIZE, GFP_KERNEL);
> + if (!list_info)
> + return -ENOMEM;
> +
> + dma_addr = dma_map_single(pdsc->dev, list_info, PDS_PAGE_SIZE,
> + DMA_FROM_DEVICE);
> + if (dma_mapping_error(pdsc->dev, dma_addr)) {
> + dev_err(pdsc->dev,
> + "Failed to dma_map component_list_info length %d\n",
> + PDS_PAGE_SIZE);
> + err = -ENOMEM;
> + goto out;
> + }
> +
> + cmd.get_component_info.data_len = cpu_to_le16(PDS_PAGE_SIZE);
> + cmd.get_component_info.data_pa = cpu_to_le64(dma_addr);
> +
> + err = pdsc_devcmd(pdsc, &cmd, &comp, pdsc->devcmd_timeout * 2);
> + dma_unmap_single(pdsc->dev, dma_addr, PDS_PAGE_SIZE, DMA_FROM_DEVICE);
> + if (err)
> + goto out;
> +
> + if (comp.get_component_info.ver == 0) {
> + /* Don't support backward compatibility as version 0 has
> + * alignment issues, so give a hint to users to update
> + * their firmware
> + */
> + dev_warn(pdsc->dev,
> + "Incompatible get_component_info version %u reported by firmware\n",
> + comp.get_component_info.ver);
> + err = 0;
> + goto out;
> + }
[Medium]
The comment states version 0 is not supported because of alignment
issues, but the code sets err = 0 and falls through with
pdsc->fw_components.num_components left at zero.
The caller pdsc_pldm_firmware_update() treats err == 0 as success and
calls pldmfw_flash_image() with no cached components, which leads to
a few user-visible side effects:
- When a user passes a specific component, pdsc_get_slot_id() iterates
an empty fw_components array and returns PDS_CORE_FW_SLOT_MAX, which
bubbles up as -ENOENT from pdsc_send_component_table() rather than a
"firmware too old" indication.
- For a full-image flash, pdsc_set_component_name() finds no matches,
so devlink_flash_update_status_notify() ends up reporting empty
component strings throughout the flash.
- pdsc_pldm_firmware_update() resets num_components = 0 at the end of
every flash, so the dev_warn and the same downstream paths recur on
every subsequent attempt.
Should this path return an error such as -EOPNOTSUPP so the documented
intent matches what callers see?
[ ... ]
> +static int pdsc_devcmd_send_component(struct pdsc *pdsc,
> + struct pds_core_flash_component *info,
> + u16 info_sz, dma_addr_t addr, u32 length,
> + u32 offset, u16 slot_id,
> + union pds_core_dev_comp *comp)
> +{
> + union pds_core_dev_cmd cmd = {
> + .send_component.opcode = PDS_CORE_CMD_SEND_COMPONENT,
> + .send_component.ver = 1,
> + .send_component.operation = PDS_CORE_SEND_COMPONENT_START,
> + .send_component.data_pa = cpu_to_le64(addr),
> + .send_component.data_len = cpu_to_le32(length),
> + .send_component.offset = cpu_to_le32(offset),
> + .send_component.slot_id = slot_id,
> + };
> + unsigned long timeout = 300 * HZ;
> + unsigned long start_time;
> + unsigned long end_time;
> + int err;
> +
> + start_time = jiffies;
> + end_time = start_time + timeout;
> + do {
> + /* prevent noisy/benign devcmd failures */
> + err = pdsc_devcmd_with_data_nomsg(pdsc, &cmd, info, info_sz,
> + comp, 60);
> + if (err != -EAGAIN)
> + break;
> +
> + /* if required, subsequent commands check status of
> + * PDS_CORE_CMD_SEND_COMPONENT command, which returns
> + * EAGAIN while the command is still running,
> + * else we get the final command status.
> + */
> + cmd.send_component.operation = PDS_CORE_SEND_COMPONENT_STATUS;
> + msleep(20);
> + } while (time_before(jiffies, end_time));
> +
> + if (err == -EAGAIN || err == -ETIMEDOUT)
> + dev_err(pdsc->dev, "PDS_CORE_CMD_SEND_COMPONENT timed out\n");
> +
> + return err;
> +}
[ ... ]
> +static int pdsc_flash_component(struct pldmfw *context,
> + struct pldmfw_component *component)
> +{
[ ... ]
> + err = pdsc_devcmd_send_component(pdsc, component_info, info_sz,
> + dma_addr, copy_sz, offset,
> + slot_id, &comp);
> + dma_unmap_single(dev, dma_addr, copy_sz, DMA_TO_DEVICE);
> + kfree(component_data);
[High]
On the -EAGAIN or -ETIMEDOUT exit from pdsc_devcmd_send_component(),
can the device still be DMA-reading from this buffer when the unmap
and kfree happen here?
The polling loop in pdsc_devcmd_send_component() runs up to 300 seconds
and the only exits when the device is still busy are -EAGAIN (the
status query explicitly says START has not completed) or -ETIMEDOUT.
No abort or cancel opcode is issued, and the wire format documented
for struct pds_core_send_component_cmd hands data_pa to the device on
the START call.
Looking at __pdsc_devcmd_locked() in pds_core/dev.c:
if ((err == -ENXIO || err == -ETIMEDOUT) && pdsc->wq)
queue_work(pdsc->wq, &pdsc->health_work);
-EAGAIN does not even trigger health_work, so the device is not
treated as hung in that case.
Once control returns to pdsc_flash_component():
err = pdsc_devcmd_send_component(...);
dma_unmap_single(dev, dma_addr, copy_sz, DMA_TO_DEVICE);
kfree(component_data);
If the firmware finishes the flash a moment later and reads from
data_pa, on IOMMU systems the unmap can race the DMA and produce
IOMMU faults; on non-IOMMU systems the device can DMA-read from
freed slab memory belonging to a later allocation.
Should this path leak the buffer, reset the device, or send an abort
opcode to the firmware before unmapping and freeing on the busy
timeout exits?
[ ... ]
next prev parent reply other threads:[~2026-05-20 23:44 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-29 8:28 [PATCH net-next 0/6] pds_core: Add PLDM firmware update and host backed memory support Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 1/6] pds_core: add support for quiet devcmd failures Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 2/6] pds_core: add support for identity version 2 Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 3/6] pds_core: add PLDM firmware update support via devlink flash Nikhil P. Rao
2026-05-01 1:05 ` Jakub Kicinski
2026-05-01 20:03 ` Rao, Nikhil
2026-04-29 8:28 ` [PATCH net-next 4/6] pds_core: add PLDM component info display Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 5/6] pds_core: add host backed memory support for firmware Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 6/6] pds_core: add debugfs support for host backed memory Nikhil P. Rao
2026-05-16 2:42 ` [PATCH net-next v2 0/6] PLDM Firmware Update Support for pds_core Nikhil P. Rao
2026-05-16 2:42 ` [PATCH net-next v2 1/6] pds_core: add support for quiet devcmd failures Nikhil P. Rao
2026-05-16 2:42 ` [PATCH net-next v2 2/6] pds_core: add support for identity version 2 Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski
2026-05-16 2:42 ` [PATCH net-next v2 3/6] pds_core: add PLDM firmware update support via devlink flash Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski [this message]
2026-05-16 2:42 ` [PATCH net-next v2 4/6] pds_core: add PLDM component info display Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski
2026-05-20 23:47 ` Jakub Kicinski
2026-05-16 2:42 ` [PATCH net-next v2 5/6] pds_core: add host backed memory support for firmware Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski
2026-05-16 2:42 ` [PATCH net-next v2 6/6] pds_core: add debugfs support for host backed memory Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260520234432.567056-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=Vamsi.Atluri@amd.com \
--cc=andrew+netdev@lunn.ch \
--cc=brett.creeley@amd.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.joyner@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nikhil.rao@amd.com \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox