From: Simon Horman <horms@kernel.org>
To: nikhil.rao@amd.com
Cc: Simon Horman <horms@kernel.org>,
netdev@vger.kernel.org, brett.creeley@amd.com,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, eric.joyner@amd.com
Subject: Re: [PATCH v4 4/6] pds_core: add PLDM component info display
Date: Mon, 15 Jun 2026 17:08:45 +0100 [thread overview]
Message-ID: <20260615160842.776643-4-horms@kernel.org> (raw)
In-Reply-To: <20260614050052.1048328-5-nikhil.rao@amd.com>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://netdev-ai.bots.linux.dev/sashiko/
Simon says: This review seems to be a bit of a mixed bag.
I've annotated the issues it raises that I'd appreciate
you looking over, and the rest, which I don't feel fall
in that category.
---
pds_core: add PLDM component info display
This patch reports per-component firmware versions through devlink info on
v2+ devices, replacing the single fw running version with names derived
from the firmware-supplied component list and slot/flag fields.
Simon says: I'm not particularly concerned about the issue raised below -
I'm forwarding this review because of others raised further down.
[Low]
Before going to the diff, this patch immediately renames user-visible
component names introduced in the immediately preceding commit in the
same series ("pds_core: add PLDM firmware update support via devlink
flash"):
fw.mainfw -> fw
fw.mainfw.gold -> fw.gold
Could these two patches be folded together (or reordered) so the series
introduces the final naming once? Anyone bisecting between the two
commits would land on the obsolete names.
> diff --git a/drivers/net/ethernet/amd/pds_core/devlink.c b/drivers/net/ethernet/amd/pds_core/devlink.c
> index 3b763ee1715e..2f40b97affd6 100644
> --- a/drivers/net/ethernet/amd/pds_core/devlink.c
> +++ b/drivers/net/ethernet/amd/pds_core/devlink.c
> @@ -93,14 +93,105 @@ int pdsc_dl_flash_update(struct devlink *dl,
> return pdsc_firmware_update(pdsc, params, extack);
> }
>
> +static int pdsc_dl_report_component(struct devlink_info_req *req,
> + struct pds_core_fw_component_info *info)
> +{
> + enum devlink_info_version_type ver_type;
> + u16 flags = le16_to_cpu(info->flags);
> + char *ver = info->version;
> + const char *name;
> + char buf[32];
> +
> + /* Main firmware is reported as generic "fw" */
> + if (info->component_type == PDS_CORE_FW_TYPE_MAIN) {
> + if (info->slot_id == PDS_CORE_FW_SLOT_GOLD)
> + snprintf(buf, sizeof(buf), "fw.gold");
> + else
> + snprintf(buf, sizeof(buf), "fw");
> + } else {
> + name = pdsc_fw_type_to_name(info->component_type);
> + if (!name)
> + return 0;
Simon says: Ditto
[Low]
When firmware reports a component_type that pdsc_fw_type_names[] does
not know about, this returns 0 and silently drops the entry. Should
this at least dev_warn_once() so a newer firmware advertising a new
component type does not vanish from devlink output without any
operator-visible signal?
> +
> + if (info->slot_id == PDS_CORE_FW_SLOT_GOLD)
> + snprintf(buf, sizeof(buf), "fw.%s.gold", name);
> + else
> + snprintf(buf, sizeof(buf), "fw.%s", name);
> + }
Simon says: And this one seems intentional, so I think
this portion of the review can be ignored.
[Low]
This branch appends ".gold" to any non-MAIN component when slot_id ==
PDS_CORE_FW_SLOT_GOLD, which produces names like fw.bootloader.gold,
fw.uboot.gold and fw.cpld.gold. The Documentation table updated in
this same patch only lists fw.gold as the stored variant, and the
example output only mentions fw.uboot.gold.
Should the gold suffix be restricted to MAIN, or should the
documentation table be expanded to cover any component_type also
appearing as a .gold variant?
> +
> + ver_type = DEVLINK_INFO_VERSION_TYPE_NONE;
> + if (flags & PDS_CORE_FW_COMPONENT_INFO_F_UPDATE_BY_NAME)
> + ver_type = DEVLINK_INFO_VERSION_TYPE_COMPONENT;
> +
> + if (flags & PDS_CORE_FW_COMPONENT_INFO_F_FIXED) {
[ ... ]
> + return 0;
> +}
> +
> +static int pdsc_dl_component_info_get(struct devlink *dl,
> + struct devlink_info_req *req,
> + struct netlink_ext_ack *extack)
> +{
> + struct pds_core_component_list_info *list_info;
> + struct pdsc *pdsc = devlink_priv(dl);
> + u8 num_components;
> + int err;
> + int i;
> +
> + if (!pdsc->fw_components.num_components) {
> + err = pdsc_get_component_info(pdsc);
> + if (err) {
> + dev_err(pdsc->dev, "Failed to get component_info %pe\n",
> + ERR_PTR(err));
> + return err;
> + }
> + }
Simon says: ditto
[Medium, Medium]
Two questions about the cache check above.
First, pdsc_get_component_info() has a branch that returns 0 without
populating num_components when firmware reports
comp.get_component_info.ver == 0:
if (comp.get_component_info.ver == 0) {
dev_warn_once(pdsc->dev,
"Incompatible get_component_info version %u ...\n",
comp.get_component_info.ver);
err = 0;
goto out;
}
Since pdsc->fw_components.num_components stays 0 on this path, every
subsequent devlink dev info request re-enters pdsc_get_component_info()
and runs through kmalloc / kzalloc(PDS_PAGE_SIZE) / dma_map_single() /
a full devcmd round-trip with pdsc->devcmd_timeout * 2 under
devcmd_lock / dma_unmap_single(). dev_warn_once() suppresses the
warning after the first hit, so the repetition is invisible. Should
there be an "unsupported" sentinel that prevents re-querying once v0
firmware is detected?
Simon says: However, the issue flagged below seems to warrant
investigation in the context of this patch-set.
Second, pdsc->fw_components is only invalidated at the end of
pdsc_firmware_update() ("pdsc->fw_components.num_components = 0;").
If firmware versions change through any other path (FW recovery via
health_work, PCI reset via pci_reset_work, side-channel upgrade) the
cached list stays around and devlink dev info reports stale versions.
Should those reset paths also clear pdsc->fw_components?
> +
> + list_info = &pdsc->fw_components;
> + num_components = min_t(u16, list_info->num_components,
> + le16_to_cpu(pdsc->dev_ident.max_fw_slots));
Simon says: Likewise, the one below seems to warrant investigation too.
[Medium]
num_components has already been bounded against
PDS_CORE_FW_COMPONENT_LIST_LEN inside pdsc_get_component_info().
max_fw_slots looks like a slot count (gold/main-a/main-b), but the
component list also contains cpld/bootloader/uboot/etc. If firmware
reports six components on a three-slot device, this min_t() silently
truncates legitimate component entries from the devlink output.
Is this cap intentional, and if so should truncation be logged? If
not, can the cap simply be dropped?
> + for (i = 0; i < num_components; i++) {
> + err = pdsc_dl_report_component(req, &list_info->info[i]);
> + if (err)
> + return err;
> + }
> +
> + return 0;
> +}
> +
> static char *fw_slotnames[] = {
> "fw.goldfw",
> "fw.mainfwa",
> "fw.mainfwb",
> };
>
> -int pdsc_dl_info_get(struct devlink *dl, struct devlink_info_req *req,
> - struct netlink_ext_ack *extack)
> +static int pdsc_dl_fw_list_info_get(struct devlink *dl,
> + struct devlink_info_req *req,
> + struct netlink_ext_ack *extack)
> {
[ ... ]
> @@ -134,12 +225,49 @@ int pdsc_dl_info_get(struct devlink *dl, struct devlink_info_req *req,
> return err;
> }
>
> + return 0;
> +}
> +
> +static int pdsc_dl_info_get_v1(struct devlink *dl,
> + struct devlink_info_req *req,
> + struct netlink_ext_ack *extack)
> +{
> + struct pdsc *pdsc = devlink_priv(dl);
> + int err;
> +
> + err = pdsc_dl_fw_list_info_get(dl, req, extack);
> + if (err)
> + dev_warn_once(pdsc->dev, "Failed to get fw list: %pe\n",
> + ERR_PTR(err));
Simon says: Again, the one below seems to warrant investigation.
[Medium]
Pre-patch, devlink_info_version_stored_put() failures inside the fw
list loop returned to the devlink layer immediately. After this
refactor, pdsc_dl_fw_list_info_get() returns the error but
pdsc_dl_info_get_v1() converts it to dev_warn_once() and continues to
append the running fw version and the ASIC fields.
Does this mean a mid-loop failure (for example -EMSGSIZE from netlink
attribute append) is now masked, leaving a partially populated devlink
info reply that gets reported as success?
> +
> + /* Version 1: report fw from dev_info (running only) */
> err = devlink_info_version_running_put(req,
> DEVLINK_INFO_VERSION_GENERIC_FW,
> pdsc->dev_info.fw_version);
> if (err)
> return err;
>
> + return 0;
> +}
> +
> +int pdsc_dl_info_get(struct devlink *dl, struct devlink_info_req *req,
> + struct netlink_ext_ack *extack)
> +{
> + struct pdsc *pdsc = devlink_priv(dl);
> + char buf[32];
> + int err;
> +
> + if (pdsc->dev_ident.version >= PDS_CORE_IDENTITY_VERSION_2) {
> + err = pdsc_dl_component_info_get(dl, req, extack);
> + if (err)
> + dev_warn_once(pdsc->dev, "Failed to get component info: %pe\n",
> + ERR_PTR(err));
> + } else {
> + err = pdsc_dl_info_get_v1(dl, req, extack);
> + if (err)
> + return err;
> + }
Simon says: but this one seems to be more a question of style that substance.
[Medium]
On v2+ devices, errors from pdsc_dl_component_info_get() are swallowed
into dev_warn_once() and the function falls through to the ASIC
fields. Pre-patch behavior always reported pdsc->dev_info.fw_version
via DEVLINK_INFO_VERSION_GENERIC_FW.
After this change, transient failures inside pdsc_get_component_info()
(DMA mapping failure, -ENOMEM, the -ETIMEDOUT/-EAGAIN deferred-DMA
path, the v0-firmware return-0-without-populating branch, the
num_components > LIST_LEN -ENOMEM branch), or even firmware enumerating
components without setting PDS_CORE_FW_COMPONENT_INFO_F_RUNNING on the
MAIN component, cause devlink dev info to show only asic.id, asic.rev
and the serial number with no firmware version reported at all.
Could the v2 path either retain the unconditional running fw fallback
from dev_info.fw_version, or fall through to pdsc_dl_info_get_v1() on
failure, so devlink info never silently loses the fw running version?
Also, since dev_warn_once() suppresses the warning across the lifetime
of the device, callers cannot tell that the netlink reply is partial:
a partially populated message gets the ASIC fields appended and is
reported as success.
> +
> snprintf(buf, sizeof(buf), "0x%x", pdsc->dev_info.asic_type);
> err = devlink_info_version_fixed_put(req,
> DEVLINK_INFO_VERSION_GENERIC_ASIC_ID,
next prev parent reply other threads:[~2026-06-15 16:19 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-14 5:00 [PATCH net-next v4 0/6] pds_core: Add PLDM firmware update and host backed memory support Nikhil P. Rao
2026-06-14 5:00 ` [PATCH v4 1/6] pds_core: add support for quiet devcmd failures Nikhil P. Rao
2026-06-15 15:34 ` Simon Horman
2026-06-15 15:37 ` Simon Horman
2026-06-14 5:00 ` [PATCH v4 2/6] pds_core: add support for identity version 2 Nikhil P. Rao
2026-06-14 5:00 ` [PATCH v4 3/6] pds_core: add PLDM firmware update support via devlink flash Nikhil P. Rao
2026-06-15 15:36 ` Simon Horman
2026-06-14 5:00 ` [PATCH v4 4/6] pds_core: add PLDM component info display Nikhil P. Rao
2026-06-15 16:08 ` Simon Horman [this message]
2026-06-14 5:00 ` [PATCH v4 5/6] pds_core: add host backed memory support for firmware Nikhil P. Rao
2026-06-15 16:19 ` Simon Horman
2026-06-14 5:00 ` [PATCH v4 6/6] pds_core: add debugfs support for host backed memory Nikhil P. Rao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260615160842.776643-4-horms@kernel.org \
--to=horms@kernel.org \
--cc=andrew+netdev@lunn.ch \
--cc=brett.creeley@amd.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=eric.joyner@amd.com \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nikhil.rao@amd.com \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox