From: Jakub Kicinski <kuba@kernel.org>
To: nikhil.rao@amd.com
Cc: Jakub Kicinski <kuba@kernel.org>,
brett.creeley@amd.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
eric.joyner@amd.com, Vamsi.Atluri@amd.com
Subject: Re: [PATCH net-next v2 3/6] pds_core: add PLDM firmware update support via devlink flash
Date: Wed, 20 May 2026 16:44:32 -0700 [thread overview]
Message-ID: <20260520234432.567056-1-kuba@kernel.org> (raw)
In-Reply-To: <20260516-upstream_v2_clean-v2-3-7e0d66bf4020@amd.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
pds_core: add PLDM firmware update support via devlink flash
This adds PLDM firmware update support to pds_core via the upstream
pldmfw API, allowing flashing of entire packages or individual
components through devlink dev flash.
> diff --git a/drivers/net/ethernet/amd/pds_core/fw.c b/drivers/net/ethernet/amd/pds_core/fw.c
> index fa626719e68d..f091a753bce9 100644
> --- a/drivers/net/ethernet/amd/pds_core/fw.c
> +++ b/drivers/net/ethernet/amd/pds_core/fw.c
[ ... ]
> +int pdsc_get_component_info(struct pdsc *pdsc)
> +{
> + union pds_core_dev_cmd cmd = {
> + .get_component_info.opcode = PDS_CORE_CMD_GET_COMPONENT_INFO,
> + .get_component_info.ver = 1,
> + };
> + struct pds_core_component_list_info *list_info;
> + union pds_core_dev_comp comp = {};
> + dma_addr_t dma_addr;
> + u8 num_components;
> + int err, i;
> +
> + list_info = kzalloc(PDS_PAGE_SIZE, GFP_KERNEL);
> + if (!list_info)
> + return -ENOMEM;
> +
> + dma_addr = dma_map_single(pdsc->dev, list_info, PDS_PAGE_SIZE,
> + DMA_FROM_DEVICE);
> + if (dma_mapping_error(pdsc->dev, dma_addr)) {
> + dev_err(pdsc->dev,
> + "Failed to dma_map component_list_info length %d\n",
> + PDS_PAGE_SIZE);
> + err = -ENOMEM;
> + goto out;
> + }
> +
> + cmd.get_component_info.data_len = cpu_to_le16(PDS_PAGE_SIZE);
> + cmd.get_component_info.data_pa = cpu_to_le64(dma_addr);
> +
> + err = pdsc_devcmd(pdsc, &cmd, &comp, pdsc->devcmd_timeout * 2);
> + dma_unmap_single(pdsc->dev, dma_addr, PDS_PAGE_SIZE, DMA_FROM_DEVICE);
> + if (err)
> + goto out;
> +
> + if (comp.get_component_info.ver == 0) {
> + /* Don't support backward compatibility as version 0 has
> + * alignment issues, so give a hint to users to update
> + * their firmware
> + */
> + dev_warn(pdsc->dev,
> + "Incompatible get_component_info version %u reported by firmware\n",
> + comp.get_component_info.ver);
> + err = 0;
> + goto out;
> + }
[Medium]
The comment states version 0 is not supported because of alignment
issues, but the code sets err = 0 and falls through with
pdsc->fw_components.num_components left at zero.
The caller pdsc_pldm_firmware_update() treats err == 0 as success and
calls pldmfw_flash_image() with no cached components, which leads to
a few user-visible side effects:
- When a user passes a specific component, pdsc_get_slot_id() iterates
an empty fw_components array and returns PDS_CORE_FW_SLOT_MAX, which
bubbles up as -ENOENT from pdsc_send_component_table() rather than a
"firmware too old" indication.
- For a full-image flash, pdsc_set_component_name() finds no matches,
so devlink_flash_update_status_notify() ends up reporting empty
component strings throughout the flash.
- pdsc_pldm_firmware_update() resets num_components = 0 at the end of
every flash, so the dev_warn and the same downstream paths recur on
every subsequent attempt.
Should this path return an error such as -EOPNOTSUPP so the documented
intent matches what callers see?
[ ... ]
> +static int pdsc_devcmd_send_component(struct pdsc *pdsc,
> + struct pds_core_flash_component *info,
> + u16 info_sz, dma_addr_t addr, u32 length,
> + u32 offset, u16 slot_id,
> + union pds_core_dev_comp *comp)
> +{
> + union pds_core_dev_cmd cmd = {
> + .send_component.opcode = PDS_CORE_CMD_SEND_COMPONENT,
> + .send_component.ver = 1,
> + .send_component.operation = PDS_CORE_SEND_COMPONENT_START,
> + .send_component.data_pa = cpu_to_le64(addr),
> + .send_component.data_len = cpu_to_le32(length),
> + .send_component.offset = cpu_to_le32(offset),
> + .send_component.slot_id = slot_id,
> + };
> + unsigned long timeout = 300 * HZ;
> + unsigned long start_time;
> + unsigned long end_time;
> + int err;
> +
> + start_time = jiffies;
> + end_time = start_time + timeout;
> + do {
> + /* prevent noisy/benign devcmd failures */
> + err = pdsc_devcmd_with_data_nomsg(pdsc, &cmd, info, info_sz,
> + comp, 60);
> + if (err != -EAGAIN)
> + break;
> +
> + /* if required, subsequent commands check status of
> + * PDS_CORE_CMD_SEND_COMPONENT command, which returns
> + * EAGAIN while the command is still running,
> + * else we get the final command status.
> + */
> + cmd.send_component.operation = PDS_CORE_SEND_COMPONENT_STATUS;
> + msleep(20);
> + } while (time_before(jiffies, end_time));
> +
> + if (err == -EAGAIN || err == -ETIMEDOUT)
> + dev_err(pdsc->dev, "PDS_CORE_CMD_SEND_COMPONENT timed out\n");
> +
> + return err;
> +}
[ ... ]
> +static int pdsc_flash_component(struct pldmfw *context,
> + struct pldmfw_component *component)
> +{
[ ... ]
> + err = pdsc_devcmd_send_component(pdsc, component_info, info_sz,
> + dma_addr, copy_sz, offset,
> + slot_id, &comp);
> + dma_unmap_single(dev, dma_addr, copy_sz, DMA_TO_DEVICE);
> + kfree(component_data);
[High]
On the -EAGAIN or -ETIMEDOUT exit from pdsc_devcmd_send_component(),
can the device still be DMA-reading from this buffer when the unmap
and kfree happen here?
The polling loop in pdsc_devcmd_send_component() runs up to 300 seconds
and the only exits when the device is still busy are -EAGAIN (the
status query explicitly says START has not completed) or -ETIMEDOUT.
No abort or cancel opcode is issued, and the wire format documented
for struct pds_core_send_component_cmd hands data_pa to the device on
the START call.
Looking at __pdsc_devcmd_locked() in pds_core/dev.c:
if ((err == -ENXIO || err == -ETIMEDOUT) && pdsc->wq)
queue_work(pdsc->wq, &pdsc->health_work);
-EAGAIN does not even trigger health_work, so the device is not
treated as hung in that case.
Once control returns to pdsc_flash_component():
err = pdsc_devcmd_send_component(...);
dma_unmap_single(dev, dma_addr, copy_sz, DMA_TO_DEVICE);
kfree(component_data);
If the firmware finishes the flash a moment later and reads from
data_pa, on IOMMU systems the unmap can race the DMA and produce
IOMMU faults; on non-IOMMU systems the device can DMA-read from
freed slab memory belonging to a later allocation.
Should this path leak the buffer, reset the device, or send an abort
opcode to the firmware before unmapping and freeing on the busy
timeout exits?
[ ... ]
next prev parent reply other threads:[~2026-05-20 23:44 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-29 8:28 [PATCH net-next 0/6] pds_core: Add PLDM firmware update and host backed memory support Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 1/6] pds_core: add support for quiet devcmd failures Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 2/6] pds_core: add support for identity version 2 Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 3/6] pds_core: add PLDM firmware update support via devlink flash Nikhil P. Rao
2026-05-01 1:05 ` Jakub Kicinski
2026-05-01 20:03 ` Rao, Nikhil
2026-04-29 8:28 ` [PATCH net-next 4/6] pds_core: add PLDM component info display Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 5/6] pds_core: add host backed memory support for firmware Nikhil P. Rao
2026-04-29 8:28 ` [PATCH net-next 6/6] pds_core: add debugfs support for host backed memory Nikhil P. Rao
2026-05-16 2:42 ` [PATCH net-next v2 0/6] PLDM Firmware Update Support for pds_core Nikhil P. Rao
2026-05-16 2:42 ` [PATCH net-next v2 1/6] pds_core: add support for quiet devcmd failures Nikhil P. Rao
2026-05-16 2:42 ` [PATCH net-next v2 2/6] pds_core: add support for identity version 2 Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski
2026-05-16 2:42 ` [PATCH net-next v2 3/6] pds_core: add PLDM firmware update support via devlink flash Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski [this message]
2026-05-16 2:42 ` [PATCH net-next v2 4/6] pds_core: add PLDM component info display Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski
2026-05-20 23:47 ` Jakub Kicinski
2026-05-16 2:42 ` [PATCH net-next v2 5/6] pds_core: add host backed memory support for firmware Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski
2026-05-16 2:42 ` [PATCH net-next v2 6/6] pds_core: add debugfs support for host backed memory Nikhil P. Rao
2026-05-20 23:44 ` Jakub Kicinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260520234432.567056-1-kuba@kernel.org \
--to=kuba@kernel.org \
--cc=Vamsi.Atluri@amd.com \
--cc=andrew+netdev@lunn.ch \
--cc=brett.creeley@amd.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.joyner@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nikhil.rao@amd.com \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.