* [net-next v36] mctp pcc: Implement MCTP over PCC Transport
@ 2026-03-25 19:42 Adam Young
2026-03-26 22:29 ` Jakub Kicinski
2026-03-29 21:15 ` Jakub Kicinski
0 siblings, 2 replies; 8+ messages in thread
From: Adam Young @ 2026-03-25 19:42 UTC (permalink / raw)
To: Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
Implementation of a network driver for the
Management Component Transport Protocol (MCTP)
over the Platform Communication Channel (PCC),
DMTF DSP0292:
Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
The transport mechanism, Platform Communication Channels (PCC),
is part of the ACPI spec:
Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html
The PCC mechanism is managed via a mailbox implemented at
drivers/mailbox/pcc.c
MCTP devices are specified via ACPI by entries in the DSDT/SSDT and
reference channels specified in the PCCT. Messages are sent on a type
3 channel and received on a type 4 channel. Communication with other
devices uses the PCC-based doorbell mechanism: a shared memory segment
with a corresponding interrupt, and a memory register used to trigger
remote interrupts.
Unlike the existing PCC Type 2 based drivers, the mssg parameter to
mbox_send_message() is actively used. The data section of the struct
sk_buff that contains the outgoing packet is sent to the mailbox,
already properly formatted as a PCC extended message.
If the mailbox ring buffer is full, the driver stops the incoming
packet queues until a message has been sent, freeing space in the
ring buffer.
When the Type 3 channel outbox receives a txdone response interrupt,
it consumes the outgoing sk_buff, allowing it to be freed.
Bringing up an interface creates the channel between the network driver
and the mailbox driver. This enables communication with the remote
endpoint, including the receipt of new messages. Bringing down an
interface removes the channel, and no new messages can be delivered.
Stopping the interface also frees any packets that are cached in the
mailbox ringbuffer.
PCC is based on a shared buffer and a set of I/O-mapped memory locations
that the spec calls registers. This mechanism exists regardless of the
existence of the driver. If the user has the ability to map these
physical locations to virtual addresses, they have the ability to drive
the hardware. Thus, there is a security aspect to this mechanism that
extends beyond the responsibilities of the operating system.
If the hardware does not expose PCC in the ACPI tables, this device
will never be enabled, so this is only an issue on hardware that does
support PCC. In that case, it is up to the remote controller to sanitize
communication; MCTP is exposed as a socket interface, and userland
can send any crafted packet it wants. It would also be incumbent on
the hardware manufacturer to allow the end user to disable MCTP over PCC
communication if they did not want to expose it.
Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
---
MAINTAINERS | 5 +
drivers/net/mctp/Kconfig | 13 ++
drivers/net/mctp/Makefile | 1 +
drivers/net/mctp/mctp-pcc.c | 334 ++++++++++++++++++++++++++++++++++++
4 files changed, 353 insertions(+)
create mode 100644 drivers/net/mctp/mctp-pcc.c
diff --git a/MAINTAINERS b/MAINTAINERS
index a09bf30a057d..2e57ca4d61b8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15286,6 +15286,11 @@ F: include/net/mctpdevice.h
F: include/net/netns/mctp.h
F: net/mctp/
+MANAGEMENT COMPONENT TRANSPORT PROTOCOL (MCTP) over PCC (MCTP-PCC) Driver
+M: Adam Young <admiyo@os.amperecomputing.com>
+S: Maintained
+F: drivers/net/mctp/mctp-pcc.c
+
MAPLE TREE
M: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Alice Ryhl <aliceryhl@google.com>
diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
index cf325ab0b1ef..77cd4091050c 100644
--- a/drivers/net/mctp/Kconfig
+++ b/drivers/net/mctp/Kconfig
@@ -47,6 +47,19 @@ config MCTP_TRANSPORT_I3C
A MCTP protocol network device is created for each I3C bus
having a "mctp-controller" devicetree property.
+config MCTP_TRANSPORT_PCC
+ tristate "MCTP PCC transport"
+ depends on ACPI
+ help
+ Provides a driver to access MCTP devices over PCC transport.
+ An MCTP protocol network device is created via ACPI for each
+ entry in the DSDT/SSDT that matches the identifier. The Platform
+ communication channels are selected from the corresponding
+ entries in the PCCT.
+
+ Say y here if you need to connect to MCTP endpoints over PCC. To
+ compile as a module, use m; the module will be called mctp-pcc.
+
config MCTP_TRANSPORT_USB
tristate "MCTP USB transport"
depends on USB
diff --git a/drivers/net/mctp/Makefile b/drivers/net/mctp/Makefile
index c36006849a1e..0a591299ffa9 100644
--- a/drivers/net/mctp/Makefile
+++ b/drivers/net/mctp/Makefile
@@ -1,4 +1,5 @@
obj-$(CONFIG_MCTP_SERIAL) += mctp-serial.o
obj-$(CONFIG_MCTP_TRANSPORT_I2C) += mctp-i2c.o
obj-$(CONFIG_MCTP_TRANSPORT_I3C) += mctp-i3c.o
+obj-$(CONFIG_MCTP_TRANSPORT_PCC) += mctp-pcc.o
obj-$(CONFIG_MCTP_TRANSPORT_USB) += mctp-usb.o
diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
new file mode 100644
index 000000000000..fe08fd041d9c
--- /dev/null
+++ b/drivers/net/mctp/mctp-pcc.c
@@ -0,0 +1,334 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * mctp-pcc.c - Driver for MCTP over PCC.
+ * Copyright (c) 2024-2026, Ampere Computing LLC
+ *
+ */
+
+/* Implementation of MCTP over PCC, DMTF Specification DSP0292
+ * https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
+ */
+
+#include <linux/acpi.h>
+#include <linux/hrtimer.h>
+#include <linux/if_arp.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mailbox_client.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+#include <linux/skbuff.h>
+#include <linux/string.h>
+
+#include <acpi/acpi_bus.h>
+#include <acpi/acpi_drivers.h>
+#include <acpi/acrestyp.h>
+#include <acpi/actbl.h>
+#include <acpi/pcc.h>
+#include <net/mctp.h>
+#include <net/mctpdevice.h>
+
+#define MCTP_SIGNATURE "MCTP"
+#define MCTP_SIGNATURE_LENGTH (sizeof(MCTP_SIGNATURE) - 1)
+#define MCTP_MIN_MTU 68
+#define PCC_DWORD_TYPE 0x0c
+
+struct mctp_pcc_mailbox {
+ u32 index;
+ struct pcc_mbox_chan *chan;
+ struct mbox_client client;
+};
+
+/* The netdev structure. One of these per PCC adapter. */
+struct mctp_pcc_ndev {
+ struct net_device *ndev;
+ struct acpi_device *acpi_device;
+ struct mctp_pcc_mailbox inbox;
+ struct mctp_pcc_mailbox outbox;
+};
+
+static void mctp_pcc_client_rx_callback(struct mbox_client *cl, void *mssg)
+{
+ struct acpi_pcct_ext_pcc_shared_memory pcc_header;
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *inbox;
+ struct mctp_skb_cb *cb;
+ struct sk_buff *skb;
+ int size;
+
+ mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, inbox.client);
+ inbox = &mctp_pcc_ndev->inbox;
+ memcpy_fromio(&pcc_header, inbox->chan->shmem, sizeof(pcc_header));
+
+ if (pcc_header.length <= 0) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+
+ size = pcc_header.length - sizeof(pcc_header.command) + sizeof(pcc_header);
+ if (memcmp(&pcc_header.command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH) != 0) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+
+ skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
+ if (!skb) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+ skb_put(skb, size);
+ skb->protocol = htons(ETH_P_MCTP);
+ memcpy_fromio(skb->data, inbox->chan->shmem, size);
+ dev_dstats_rx_add(mctp_pcc_ndev->ndev, size);
+ skb_pull(skb, sizeof(pcc_header));
+ skb_reset_mac_header(skb);
+ skb_reset_network_header(skb);
+ cb = __mctp_cb(skb);
+ cb->halen = 0;
+ netif_rx(skb);
+}
+
+static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
+{
+ struct acpi_pcct_ext_pcc_shared_memory *pcc_header;
+ struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
+ int len = skb->len;
+ int rc;
+
+ rc = skb_cow_head(skb, sizeof(*pcc_header));
+ if (rc) {
+ dev_dstats_tx_dropped(ndev);
+ kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
+
+ pcc_header = skb_push(skb, sizeof(*pcc_header));
+ pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
+ pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
+ memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
+ pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
+
+ rc = mbox_send_message(mpnd->outbox.chan->mchan, skb);
+ if (rc < 0) {
+ /* Remove the header in case it gets sent again */
+ skb_pull(skb, sizeof(*pcc_header));
+ netif_stop_queue(ndev);
+ return NETDEV_TX_BUSY;
+ }
+
+ return NETDEV_TX_OK;
+}
+
+static void mctp_pcc_tx_prepare(struct mbox_client *cl, void *mssg)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *outbox;
+ struct sk_buff *skb = mssg;
+
+ mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, outbox.client);
+ outbox = &mctp_pcc_ndev->outbox;
+
+ if (!skb)
+ return;
+
+ if (skb->len > outbox->chan->shmem_size) {
+ dev_dstats_tx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+ memcpy_toio(outbox->chan->shmem, skb->data, skb->len);
+}
+
+static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct sk_buff *skb = mssg;
+
+ mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
+ dev_dstats_tx_add(mctp_pcc_ndev->ndev, skb->len);
+ dev_consume_skb_any(skb);
+ netif_wake_queue(mctp_pcc_ndev->ndev);
+}
+
+static int mctp_pcc_ndo_open(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev = netdev_priv(ndev);
+ struct mctp_pcc_mailbox *outbox, *inbox;
+
+ outbox = &mctp_pcc_ndev->outbox;
+ inbox = &mctp_pcc_ndev->inbox;
+
+ outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
+ if (IS_ERR(outbox->chan))
+ return PTR_ERR(outbox->chan);
+
+ inbox->client.rx_callback = mctp_pcc_client_rx_callback;
+ inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
+ if (IS_ERR(inbox->chan)) {
+ pcc_mbox_free_channel(outbox->chan);
+ return PTR_ERR(inbox->chan);
+ }
+ return 0;
+}
+
+static int mctp_pcc_ndo_stop(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev = netdev_priv(ndev);
+
+ pcc_mbox_free_channel(mctp_pcc_ndev->outbox.chan);
+ pcc_mbox_free_channel(mctp_pcc_ndev->inbox.chan);
+ return 0;
+}
+
+static const struct net_device_ops mctp_pcc_netdev_ops = {
+ .ndo_open = mctp_pcc_ndo_open,
+ .ndo_stop = mctp_pcc_ndo_stop,
+ .ndo_start_xmit = mctp_pcc_tx,
+};
+
+static void mctp_pcc_setup(struct net_device *ndev)
+{
+ ndev->type = ARPHRD_MCTP;
+ ndev->hard_header_len = sizeof(struct acpi_pcct_ext_pcc_shared_memory);
+ ndev->tx_queue_len = 0;
+ ndev->flags = IFF_NOARP;
+ ndev->netdev_ops = &mctp_pcc_netdev_ops;
+ ndev->needs_free_netdev = true;
+ ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
+}
+
+struct mctp_pcc_lookup_context {
+ int index;
+ u32 inbox_index;
+ u32 outbox_index;
+};
+
+static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
+ void *context)
+{
+ struct mctp_pcc_lookup_context *luc = context;
+ struct acpi_resource_address32 *addr;
+
+ if (ares->type != PCC_DWORD_TYPE)
+ return AE_OK;
+
+ addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
+ switch (luc->index) {
+ case 0:
+ luc->outbox_index = addr[0].address.minimum;
+ break;
+ case 1:
+ luc->inbox_index = addr[0].address.minimum;
+ break;
+ }
+ luc->index++;
+ return AE_OK;
+}
+
+static void mctp_cleanup_netdev(void *data)
+{
+ struct net_device *ndev = data;
+
+ mctp_unregister_netdev(ndev);
+}
+
+static int initialize_MTU(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *outbox;
+ struct pcc_mbox_chan *pchan;
+ int mctp_pcc_mtu;
+
+ mctp_pcc_mtu = MCTP_MIN_MTU;
+ mctp_pcc_ndev = netdev_priv(ndev);
+ outbox = &mctp_pcc_ndev->outbox;
+ pchan = pcc_mbox_request_channel(&outbox->client, outbox->index);
+ if (IS_ERR(pchan))
+ return -1;
+
+ mctp_pcc_mtu = pchan->shmem_size;
+ pcc_mbox_free_channel(pchan);
+
+ mctp_pcc_mtu = mctp_pcc_mtu - sizeof(struct acpi_pcct_ext_pcc_shared_memory);
+ ndev->mtu = MCTP_MIN_MTU;
+ ndev->max_mtu = mctp_pcc_mtu;
+ ndev->min_mtu = MCTP_MIN_MTU;
+
+ return 0;
+}
+
+static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
+{
+ struct mctp_pcc_lookup_context context = {0};
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct device *dev = &acpi_dev->dev;
+ struct net_device *ndev;
+ acpi_handle dev_handle;
+ acpi_status status;
+ char name[32];
+ int rc;
+
+ dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
+ acpi_device_hid(acpi_dev));
+ dev_handle = acpi_device_handle(acpi_dev);
+ status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
+ &context);
+ if (!ACPI_SUCCESS(status)) {
+ dev_err(dev, "FAILED to lookup PCC indexes from CRS\n");
+ return -EINVAL;
+ }
+
+ snprintf(name, sizeof(name), "mctppcc%d", context.inbox_index);
+ ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
+ mctp_pcc_setup);
+ if (!ndev)
+ return -ENOMEM;
+
+ mctp_pcc_ndev = netdev_priv(ndev);
+
+ mctp_pcc_ndev->inbox.index = context.inbox_index;
+ mctp_pcc_ndev->inbox.client.dev = dev;
+ mctp_pcc_ndev->outbox.index = context.outbox_index;
+ mctp_pcc_ndev->outbox.client.dev = dev;
+
+ mctp_pcc_ndev->outbox.client.tx_prepare = mctp_pcc_tx_prepare;
+ mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
+ mctp_pcc_ndev->acpi_device = acpi_dev;
+ mctp_pcc_ndev->ndev = ndev;
+ acpi_dev->driver_data = mctp_pcc_ndev;
+
+ rc = initialize_MTU(ndev);
+ if (rc)
+ goto free_netdev;
+
+ rc = mctp_register_netdev(ndev, NULL, MCTP_PHYS_BINDING_PCC);
+ if (rc)
+ goto free_netdev;
+
+ return devm_add_action_or_reset(dev, mctp_cleanup_netdev, ndev);
+free_netdev:
+ free_netdev(ndev);
+ return rc;
+}
+
+static const struct acpi_device_id mctp_pcc_device_ids[] = {
+ { "DMT0001" },
+ {}
+};
+
+static struct acpi_driver mctp_pcc_driver = {
+ .name = "mctp_pcc",
+ .class = "Unknown",
+ .ids = mctp_pcc_device_ids,
+ .ops = {
+ .add = mctp_pcc_driver_add,
+ },
+};
+
+module_acpi_driver(mctp_pcc_driver);
+
+MODULE_DEVICE_TABLE(acpi, mctp_pcc_device_ids);
+
+MODULE_DESCRIPTION("MCTP PCC ACPI device");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Adam Young <admiyo@os.amperecomputing.com>");
--
2.43.0
* Re: [net-next v36] mctp pcc: Implement MCTP over PCC Transport
2026-03-25 19:42 Adam Young
@ 2026-03-26 22:29 ` Jakub Kicinski
2026-03-29 21:15 ` Jakub Kicinski
1 sibling, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2026-03-26 22:29 UTC (permalink / raw)
To: Adam Young
Cc: Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
Eric Dumazet, Paolo Abeni, netdev, linux-kernel, Sudeep Holla,
Jonathan Cameron, Huisong Li
On Wed, 25 Mar 2026 15:42:54 -0400 Adam Young wrote:
> Implementation of network driver for
> Management Component Transport Protocol(MCTP)
> over Platform Communication Channel(PCC)
reminder: at most 1 version within 24h on netdev, please
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#tl-dr
--
pv-bot: 24h
* Re: [net-next v36] mctp pcc: Implement MCTP over PCC Transport
2026-03-25 19:42 Adam Young
2026-03-26 22:29 ` Jakub Kicinski
@ 2026-03-29 21:15 ` Jakub Kicinski
1 sibling, 0 replies; 8+ messages in thread
From: Jakub Kicinski @ 2026-03-29 21:15 UTC (permalink / raw)
To: admiyo
Cc: Jakub Kicinski, davem, jk, Jonathan.Cameron, netdev, linux-kernel,
sudeep.holla, lihuisong, andrew+netdev, pabeni, matt, edumazet
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
mctp pcc: Implement MCTP over PCC Transport
This commit implements a network driver for the Management Component
Transport Protocol (MCTP) over Platform Communication Channel (PCC).
It creates MCTP network devices via ACPI that communicate through shared
memory and mailbox mechanisms.
Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html
> diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
> index cf325ab0b1ef..77cd4091050c 100644
> --- a/drivers/net/mctp/Kconfig
> +++ b/drivers/net/mctp/Kconfig
> @@ -47,6 +47,19 @@ config MCTP_TRANSPORT_I3C
> A MCTP protocol network device is created for each I3C bus
> having a "mctp-controller" devicetree property.
>
> +config MCTP_TRANSPORT_PCC
> + tristate "MCTP PCC transport"
> + depends on ACPI
Can MCTP_TRANSPORT_PCC be selected when CONFIG_PCC is disabled?
The Kconfig depends only on ACPI but uses pcc_mbox_request_channel() and
pcc_mbox_free_channel() from the PCC mailbox driver (drivers/mailbox/pcc.c).
When CONFIG_PCC is not set, include/acpi/pcc.h provides stub functions
that make pcc_mbox_request_channel() unconditionally return ERR_PTR(-ENODEV).
This means the driver compiles but initialize_MTU() fails, ndo_open() fails,
and no MCTP-over-PCC communication is possible.
> + help
> + Provides a driver to access MCTP devices over PCC transport,
> + A MCTP protocol network device is created via ACPI for each
> + entry in the DSDT/SSDT that matches the identifier. The Platform
> + communication channels are selected from the corresponding
> + entries in the PCCT.
> +
> + Say y here if you need to connect to MCTP endpoints over PCC. To
> + compile as a module, use m; the module will be called mctp-pcc.
[ ... ]
> diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
> new file mode 100644
> index 000000000000..fe08fd041d9c
> --- /dev/null
> +++ b/drivers/net/mctp/mctp-pcc.c
[ ... ]
> +static void mctp_pcc_client_rx_callback(struct mbox_client *cl, void *mssg)
> +{
> + struct acpi_pcct_ext_pcc_shared_memory pcc_header;
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct mctp_pcc_mailbox *inbox;
> + struct mctp_skb_cb *cb;
> + struct sk_buff *skb;
> + int size;
> +
> + mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, inbox.client);
> + inbox = &mctp_pcc_ndev->inbox;
> + memcpy_fromio(&pcc_header, inbox->chan->shmem, sizeof(pcc_header));
> +
> + if (pcc_header.length <= 0) {
> + dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
> +
> + size = pcc_header.length - sizeof(pcc_header.command) + sizeof(pcc_header);
> + if (memcmp(&pcc_header.command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH) != 0) {
> + dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
> +
> + skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
> + if (!skb) {
> + dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
> + skb_put(skb, size);
> + skb->protocol = htons(ETH_P_MCTP);
> + memcpy_fromio(skb->data, inbox->chan->shmem, size);
Can pcc_header.length exceed inbox->chan->shmem_size?
The pcc_header is read from shared memory controlled by the remote endpoint.
The only validation is pcc_header.length <= 0. The size is calculated as:
size = pcc_header.length - sizeof(pcc_header.command) + sizeof(pcc_header)
= pcc_header.length + 12
If the remote endpoint sets pcc_header.length to a value larger than
shmem_size, memcpy_fromio() reads beyond the ioremapped region.
> + dev_dstats_rx_add(mctp_pcc_ndev->ndev, size);
> + skb_pull(skb, sizeof(pcc_header));
> + skb_reset_mac_header(skb);
> + skb_reset_network_header(skb);
> + cb = __mctp_cb(skb);
> + cb->halen = 0;
> + netif_rx(skb);
> +}
[ ... ]
> +static int mctp_pcc_ndo_stop(struct net_device *ndev)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev = netdev_priv(ndev);
> +
> + pcc_mbox_free_channel(mctp_pcc_ndev->outbox.chan);
> + pcc_mbox_free_channel(mctp_pcc_ndev->inbox.chan);
> + return 0;
> +}
Does mctp_pcc_ndo_stop() leak SKBs queued in the mailbox?
Looking at mbox_free_channel() in drivers/mailbox/mailbox.c, the
documentation states "The queued TX requests are simply aborted, no
callbacks are made." The implementation sets chan->cl = NULL and
chan->active_req = NULL but does not free the SKBs stored in
chan->msg_data[].
The mailbox ring buffer can hold up to MBOX_TX_QUEUE_LEN (20) entries.
When the network interface is stopped, any SKBs queued there are leaked
because:
1. The networking stack already returned NETDEV_TX_OK, so it won't free them
2. tx_done will never be called since chan->cl is NULL
The commit message claims "Stopping the interface also frees any packets
that are cached in the mailbox ringbuffer" but this doesn't appear to
happen.
[ ... ]
> +static int initialize_MTU(struct net_device *ndev)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct mctp_pcc_mailbox *outbox;
> + struct pcc_mbox_chan *pchan;
> + int mctp_pcc_mtu;
> +
> + mctp_pcc_mtu = MCTP_MIN_MTU;
> + mctp_pcc_ndev = netdev_priv(ndev);
> + outbox = &mctp_pcc_ndev->outbox;
> + pchan = pcc_mbox_request_channel(&outbox->client, outbox->index);
> + if (IS_ERR(pchan))
> + return -1;
> +
> + mctp_pcc_mtu = pchan->shmem_size;
> + pcc_mbox_free_channel(pchan);
> +
> + mctp_pcc_mtu = mctp_pcc_mtu - sizeof(struct acpi_pcct_ext_pcc_shared_memory);
What happens if pchan->shmem_size is smaller than 16 bytes?
The function subtracts sizeof(struct acpi_pcct_ext_pcc_shared_memory)
(16 bytes) from pchan->shmem_size (u64) through the local int variable
mctp_pcc_mtu. If shmem_size is smaller than 16, the subtraction produces
a negative int which wraps when assigned to ndev->max_mtu (unsigned int),
resulting in a value near UINT_MAX.
When packets are transmitted, mctp_pcc_tx_prepare() calls:
memcpy_toio(outbox->chan->shmem, skb->data, skb->len)
where skb->len can far exceed the actual mapped region (only shmem_size
bytes were ioremapped), causing writes beyond the mapped region.
> + ndev->mtu = MCTP_MIN_MTU;
> + ndev->max_mtu = mctp_pcc_mtu;
> + ndev->min_mtu = MCTP_MIN_MTU;
> +
> + return 0;
> +}
[ ... ]
--
pw-bot: cr
* [net-next v36] mctp pcc: Implement MCTP over PCC Transport
@ 2026-03-30 17:54 Adam Young
2026-04-02 2:16 ` Jeremy Kerr
0 siblings, 1 reply; 8+ messages in thread
From: Adam Young @ 2026-03-30 17:54 UTC (permalink / raw)
To: Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
Implementation of a network driver for the
Management Component Transport Protocol (MCTP)
over the Platform Communication Channel (PCC),
DMTF DSP0292:
Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
The transport mechanism, Platform Communication Channels (PCC),
is part of the ACPI spec:
Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html
The PCC mechanism is managed via a mailbox implemented at
drivers/mailbox/pcc.c
MCTP devices are specified via ACPI by entries in the DSDT/SSDT and
reference channels specified in the PCCT. Messages are sent on a type
3 channel and received on a type 4 channel. Communication with other
devices uses the PCC-based doorbell mechanism: a shared memory segment
with a corresponding interrupt, and a memory register used to trigger
remote interrupts.
The shared buffer must be at least 68 bytes long, as that is the
minimum MTU defined by the MCTP specification.
Unlike the existing PCC Type 2 based drivers, the mssg parameter to
mbox_send_message() is actively used. The data section of the struct
sk_buff that contains the outgoing packet is sent to the mailbox,
already properly formatted as a PCC extended message.
If the mailbox ring buffer is full, the driver stops the incoming
packet queues until a message has been sent, freeing space in the
ring buffer.
When the Type 3 channel outbox receives a txdone response interrupt,
it consumes the outgoing sk_buff, allowing it to be freed.
Bringing up an interface creates the channel between the network driver
and the mailbox driver. This enables communication with the remote
endpoint, including the receipt of new messages. Bringing down an
interface removes the channel, and no new messages can be delivered.
Stopping the interface will leave in place any packets that are cached
in the mailbox ring buffer. They cannot safely be freed until the PCC
mailbox attempts to deliver them and has removed them from the ring
buffer.
PCC is based on a shared buffer and a set of I/O-mapped memory locations
that the spec calls registers. This mechanism exists regardless of the
existence of the driver. If the user has the ability to map these
physical locations to virtual addresses, they have the ability to drive
the hardware. Thus, there is a security aspect to this mechanism that
extends beyond the responsibilities of the operating system.
If the hardware does not expose PCC in the ACPI tables, this device
will never be enabled, so this is only an issue on hardware that does
support PCC. In that case, it is up to the remote controller to sanitize
communication; MCTP is exposed as a socket interface, and userland
can send any crafted packet it wants. It would also be incumbent on
the hardware manufacturer to allow the end user to disable MCTP over PCC
communication if they did not want to expose it.
Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
---
MAINTAINERS | 5 +
drivers/net/mctp/Kconfig | 14 ++
drivers/net/mctp/Makefile | 1 +
drivers/net/mctp/mctp-pcc.c | 345 ++++++++++++++++++++++++++++++++++++
4 files changed, 365 insertions(+)
create mode 100644 drivers/net/mctp/mctp-pcc.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 7a2ffd9d37d5..1813a2126cc0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15288,6 +15288,11 @@ F: include/net/mctpdevice.h
F: include/net/netns/mctp.h
F: net/mctp/
+MANAGEMENT COMPONENT TRANSPORT PROTOCOL (MCTP) over PCC (MCTP-PCC) Driver
+M: Adam Young <admiyo@os.amperecomputing.com>
+S: Maintained
+F: drivers/net/mctp/mctp-pcc.c
+
MAPLE TREE
M: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Alice Ryhl <aliceryhl@google.com>
diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
index cf325ab0b1ef..4503152c70bc 100644
--- a/drivers/net/mctp/Kconfig
+++ b/drivers/net/mctp/Kconfig
@@ -47,6 +47,20 @@ config MCTP_TRANSPORT_I3C
A MCTP protocol network device is created for each I3C bus
having a "mctp-controller" devicetree property.
+config MCTP_TRANSPORT_PCC
+ tristate "MCTP PCC transport"
+ depends on ACPI
+ depends on PCC
+ help
+ Provides a driver to access MCTP devices over PCC transport.
+ An MCTP protocol network device is created via ACPI for each
+ entry in the DSDT/SSDT that matches the identifier. The Platform
+ communication channels are selected from the corresponding
+ entries in the PCCT.
+
+ Say y here if you need to connect to MCTP endpoints over PCC. To
+ compile as a module, use m; the module will be called mctp-pcc.
+
config MCTP_TRANSPORT_USB
tristate "MCTP USB transport"
depends on USB
diff --git a/drivers/net/mctp/Makefile b/drivers/net/mctp/Makefile
index c36006849a1e..0a591299ffa9 100644
--- a/drivers/net/mctp/Makefile
+++ b/drivers/net/mctp/Makefile
@@ -1,4 +1,5 @@
obj-$(CONFIG_MCTP_SERIAL) += mctp-serial.o
obj-$(CONFIG_MCTP_TRANSPORT_I2C) += mctp-i2c.o
obj-$(CONFIG_MCTP_TRANSPORT_I3C) += mctp-i3c.o
+obj-$(CONFIG_MCTP_TRANSPORT_PCC) += mctp-pcc.o
obj-$(CONFIG_MCTP_TRANSPORT_USB) += mctp-usb.o
diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
new file mode 100644
index 000000000000..700da39aa406
--- /dev/null
+++ b/drivers/net/mctp/mctp-pcc.c
@@ -0,0 +1,345 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * mctp-pcc.c - Driver for MCTP over PCC.
+ * Copyright (c) 2024-2026, Ampere Computing LLC
+ *
+ */
+
+/* Implementation of MCTP over PCC, DMTF Specification DSP0292
+ * https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
+ */
+
+#include <linux/acpi.h>
+#include <linux/hrtimer.h>
+#include <linux/if_arp.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mailbox_client.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+#include <linux/skbuff.h>
+#include <linux/string.h>
+
+#include <acpi/acpi_bus.h>
+#include <acpi/acpi_drivers.h>
+#include <acpi/acrestyp.h>
+#include <acpi/actbl.h>
+#include <acpi/pcc.h>
+#include <net/mctp.h>
+#include <net/mctpdevice.h>
+
+#define MCTP_SIGNATURE "MCTP"
+#define MCTP_SIGNATURE_LENGTH (sizeof(MCTP_SIGNATURE) - 1)
+#define MCTP_MIN_MTU 68
+#define PCC_DWORD_TYPE 0x0c
+
+struct mctp_pcc_mailbox {
+ u32 index;
+ struct pcc_mbox_chan *chan;
+ struct mbox_client client;
+};
+
+/* The netdev structure. One of these per PCC adapter. */
+struct mctp_pcc_ndev {
+ struct net_device *ndev;
+ struct acpi_device *acpi_device;
+ struct mctp_pcc_mailbox inbox;
+ struct mctp_pcc_mailbox outbox;
+};
+
+static void mctp_pcc_client_rx_callback(struct mbox_client *cl, void *mssg)
+{
+ struct acpi_pcct_ext_pcc_shared_memory pcc_header;
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *inbox;
+ struct mctp_skb_cb *cb;
+ struct sk_buff *skb;
+ int size;
+
+ mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, inbox.client);
+ inbox = &mctp_pcc_ndev->inbox;
+ memcpy_fromio(&pcc_header, inbox->chan->shmem, sizeof(pcc_header));
+
+ /* The message must at least have the PCC command indicating it is an MCTP
+ * message followed by the MCTP header, or we have a malformed message. */
+ if (pcc_header.length < sizeof(pcc_header.command) + sizeof(struct mctp_hdr)) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+
+ size = pcc_header.length - sizeof(pcc_header.command) + sizeof(pcc_header);
+ /* If the reported size is larger than the shared memory, something is wrong
+ * and the best we can do is treat it as corrupted data. */
+ if (size > inbox->chan->shmem_size) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+
+ if (memcmp(&pcc_header.command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH) != 0) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+
+ skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
+ if (!skb) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+ skb_put(skb, size);
+ skb->protocol = htons(ETH_P_MCTP);
+ memcpy_fromio(skb->data, inbox->chan->shmem, size);
+ dev_dstats_rx_add(mctp_pcc_ndev->ndev, size);
+ skb_pull(skb, sizeof(pcc_header));
+ skb_reset_mac_header(skb);
+ skb_reset_network_header(skb);
+ cb = __mctp_cb(skb);
+ cb->halen = 0;
+ netif_rx(skb);
+}
+
+static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
+{
+ struct acpi_pcct_ext_pcc_shared_memory *pcc_header;
+ struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
+ int len = skb->len;
+ int rc;
+
+ rc = skb_cow_head(skb, sizeof(*pcc_header));
+ if (rc) {
+ dev_dstats_tx_dropped(ndev);
+ kfree_skb(skb);
+ return NETDEV_TX_OK;
+ }
+
+ pcc_header = skb_push(skb, sizeof(*pcc_header));
+ pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
+ pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
+ memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
+ pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
+
+ rc = mbox_send_message(mpnd->outbox.chan->mchan, skb);
+ if (rc < 0) {
+ /* Remove the header in case it gets sent again */
+ skb_pull(skb, sizeof(*pcc_header));
+ netif_stop_queue(ndev);
+ return NETDEV_TX_BUSY;
+ }
+
+ return NETDEV_TX_OK;
+}
+
+static void mctp_pcc_tx_prepare(struct mbox_client *cl, void *mssg)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *outbox;
+ struct sk_buff *skb = mssg;
+
+ mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, outbox.client);
+ outbox = &mctp_pcc_ndev->outbox;
+
+ if (!skb)
+ return;
+
+ if (skb->len > outbox->chan->shmem_size) {
+ dev_dstats_tx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+ memcpy_toio(outbox->chan->shmem, skb->data, skb->len);
+}
+
+static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct sk_buff *skb = mssg;
+
+ mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
+ dev_dstats_tx_add(mctp_pcc_ndev->ndev, skb->len);
+ dev_consume_skb_any(skb);
+ netif_wake_queue(mctp_pcc_ndev->ndev);
+}
+
+static int mctp_pcc_ndo_open(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev = netdev_priv(ndev);
+ struct mctp_pcc_mailbox *outbox, *inbox;
+
+ outbox = &mctp_pcc_ndev->outbox;
+ inbox = &mctp_pcc_ndev->inbox;
+
+ outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
+ if (IS_ERR(outbox->chan))
+ return PTR_ERR(outbox->chan);
+
+ inbox->client.rx_callback = mctp_pcc_client_rx_callback;
+ inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
+ if (IS_ERR(inbox->chan)) {
+ pcc_mbox_free_channel(outbox->chan);
+ return PTR_ERR(inbox->chan);
+ }
+ return 0;
+}
+
+static int mctp_pcc_ndo_stop(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev = netdev_priv(ndev);
+
+ pcc_mbox_free_channel(mctp_pcc_ndev->outbox.chan);
+ pcc_mbox_free_channel(mctp_pcc_ndev->inbox.chan);
+ return 0;
+}
+
+static const struct net_device_ops mctp_pcc_netdev_ops = {
+ .ndo_open = mctp_pcc_ndo_open,
+ .ndo_stop = mctp_pcc_ndo_stop,
+ .ndo_start_xmit = mctp_pcc_tx,
+};
+
+static void mctp_pcc_setup(struct net_device *ndev)
+{
+ ndev->type = ARPHRD_MCTP;
+ ndev->hard_header_len = sizeof(struct acpi_pcct_ext_pcc_shared_memory);
+ ndev->tx_queue_len = 0;
+ ndev->flags = IFF_NOARP;
+ ndev->netdev_ops = &mctp_pcc_netdev_ops;
+ ndev->needs_free_netdev = true;
+ ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
+}
+
+struct mctp_pcc_lookup_context {
+ int index;
+ u32 inbox_index;
+ u32 outbox_index;
+};
+
+static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
+ void *context)
+{
+ struct mctp_pcc_lookup_context *luc = context;
+ struct acpi_resource_address32 *addr;
+
+ if (ares->type != PCC_DWORD_TYPE)
+ return AE_OK;
+
+ addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
+ switch (luc->index) {
+ case 0:
+ luc->outbox_index = addr[0].address.minimum;
+ break;
+ case 1:
+ luc->inbox_index = addr[0].address.minimum;
+ break;
+ }
+ luc->index++;
+ return AE_OK;
+}
+
+static void mctp_cleanup_netdev(void *data)
+{
+ struct net_device *ndev = data;
+
+ mctp_unregister_netdev(ndev);
+}
+
+static int initialize_MTU(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *outbox;
+ struct pcc_mbox_chan *pchan;
+ int mctp_pcc_mtu;
+
+ mctp_pcc_mtu = MCTP_MIN_MTU;
+ mctp_pcc_ndev = netdev_priv(ndev);
+ outbox = &mctp_pcc_ndev->outbox;
+ pchan = pcc_mbox_request_channel(&outbox->client, outbox->index);
+ if (IS_ERR(pchan))
+ return -1;
+ if (pchan->shmem_size < MCTP_MIN_MTU)
+ return -1;
+
+ mctp_pcc_mtu = pchan->shmem_size;
+ pcc_mbox_free_channel(pchan);
+
+ mctp_pcc_mtu = mctp_pcc_mtu - sizeof(struct acpi_pcct_ext_pcc_shared_memory);
+ ndev->mtu = MCTP_MIN_MTU;
+ ndev->max_mtu = mctp_pcc_mtu;
+ ndev->min_mtu = MCTP_MIN_MTU;
+
+ return 0;
+}
+
+static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
+{
+ struct mctp_pcc_lookup_context context = {0};
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct device *dev = &acpi_dev->dev;
+ struct net_device *ndev;
+ acpi_handle dev_handle;
+ acpi_status status;
+ char name[32];
+ int rc;
+
+ dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
+ acpi_device_hid(acpi_dev));
+ dev_handle = acpi_device_handle(acpi_dev);
+ status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
+ &context);
+ if (!ACPI_SUCCESS(status)) {
+ dev_err(dev, "FAILED to lookup PCC indexes from CRS\n");
+ return -EINVAL;
+ }
+
+ snprintf(name, sizeof(name), "mctppcc%d", context.inbox_index);
+ ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
+ mctp_pcc_setup);
+ if (!ndev)
+ return -ENOMEM;
+
+ mctp_pcc_ndev = netdev_priv(ndev);
+
+ mctp_pcc_ndev->inbox.index = context.inbox_index;
+ mctp_pcc_ndev->inbox.client.dev = dev;
+ mctp_pcc_ndev->outbox.index = context.outbox_index;
+ mctp_pcc_ndev->outbox.client.dev = dev;
+
+ mctp_pcc_ndev->outbox.client.tx_prepare = mctp_pcc_tx_prepare;
+ mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
+ mctp_pcc_ndev->acpi_device = acpi_dev;
+ mctp_pcc_ndev->ndev = ndev;
+ acpi_dev->driver_data = mctp_pcc_ndev;
+
+ rc = initialize_MTU(ndev);
+ if (rc)
+ goto free_netdev;
+
+ rc = mctp_register_netdev(ndev, NULL, MCTP_PHYS_BINDING_PCC);
+ if (rc)
+ goto free_netdev;
+
+ return devm_add_action_or_reset(dev, mctp_cleanup_netdev, ndev);
+free_netdev:
+ free_netdev(ndev);
+ return rc;
+}
+
+static const struct acpi_device_id mctp_pcc_device_ids[] = {
+ { "DMT0001" },
+ {}
+};
+
+static struct acpi_driver mctp_pcc_driver = {
+ .name = "mctp_pcc",
+ .class = "Unknown",
+ .ids = mctp_pcc_device_ids,
+ .ops = {
+ .add = mctp_pcc_driver_add,
+ },
+};
+
+module_acpi_driver(mctp_pcc_driver);
+
+MODULE_DEVICE_TABLE(acpi, mctp_pcc_device_ids);
+
+MODULE_DESCRIPTION("MCTP PCC ACPI device");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Adam Young <admiyo@os.amperecomputing.com>");
--
2.43.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [net-next v36] mctp pcc: Implement MCTP over PCC Transport
2026-03-30 17:54 [net-next v36] mctp pcc: Implement MCTP over PCC Transport Adam Young
@ 2026-04-02 2:16 ` Jeremy Kerr
2026-04-02 15:26 ` Adam Young
0 siblings, 1 reply; 8+ messages in thread
From: Jeremy Kerr @ 2026-04-02 2:16 UTC (permalink / raw)
To: Adam Young, Matt Johnston, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
Hi Adam,
> Implementation of network driver for
> Management Component Transport Protocol(MCTP)
> over Platform Communication Channel(PCC)
>
> DMTF DSP:0292
> Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
>
> The transport mechanism, Platform Communication Channels (PCC), is
> part of the ACPI spec:
>
> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html
>
> The PCC mechanism is managed via a mailbox implemented at
> drivers/mailbox/pcc.c
>
> MCTP devices are specified via ACPI by entries in DSDT/SSDT and
> reference channels specified in the PCCT. Messages are sent on a type
> 3 and received on a type 4 channel. Communication with other devices
> uses the PCC-based doorbell mechanism: a shared memory segment with a
> corresponding interrupt and a memory register used to trigger remote
> interrupts.
>
> The shared buffer must be at least 68 bytes long as that is the minimum
> MTU as defined by the MCTP specification.
>
> Unlike the existing PCC Type 2 based drivers, the mssg parameter to
> mbox_send_msg is actively used. The data section of the struct sk_buff
> that contains the outgoing packet is sent to the mailbox, already
> properly formatted as a PCC extended message.
>
> If the mailbox ring buffer is full, the driver stops the incoming
> packet queues until a message has been sent, freeing space in the
> ring buffer.
>
> When the Type 3 channel outbox receives a txdone response interrupt,
> it consumes the outgoing sk_buff, allowing it to be freed.
>
> Bringing up an interface creates the channel between the network driver
> and the mailbox driver. This enables communication with the remote
> endpoint, including the receipt of new messages. Bringing down an
> interface removes the channel, and no new messages can be delivered.
> Stopping the interface will leave any packets that are cached in the
> mailbox ringbuffer. They cannot safely be freed until the PCC mailbox
> attempts to deliver them and has removed them from the ring buffer.
>
> PCC is based on a shared buffer and a set of I/O mapped memory locations
> that the Spec calls registers. This mechanism exists regardless of the
> existence of the driver. If the user has the ability to map these
> physical locations to virtual locations, they have the ability to drive the
> hardware. Thus, there is a security aspect to this mechanism that extends
> beyond the responsibilities of the operating system.
>
> If the hardware does not expose the PCC in the ACPI table, this device
> will never be enabled. Thus it is only an issue on hardware that does
> support PCC. In that case, it is up to the remote controller to sanitize
> communication; MCTP will be exposed as a socket interface, and userland
> can send any crafted packet it wants. It would also be incumbent on
> the hardware manufacturer to allow the end user to disable MCTP over PCC
> communication if they did not want to expose it.
>
> Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
No changelogs anymore? This makes things more difficult to review. Just
to be clear, when I mentioned changing to a single-patch series, that
did not mean to throw out the changelog - it's still a key part of
review.
So, for v36, it looks like you have
* added the CONFIG_PCC dependency
* altered the RX min length check to include at least the command and
a MCTP header
* added a RX max-length check
* added a minimum MTU check
* altered the commit message to reflect PCC mailbox free behaviour.
On the latter, could you expand on what happens on close? Does the PCC
channel end up calling tx_done() on each pending TX during the channel
free? I'm not familiar with the PCC paths, but it doesn't look like it
(or the mailbox core) has a case to deal with this on free.
Maybe I am missing something, but could this leak skbs?
[...]
> +static int initialize_MTU(struct net_device *ndev)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct mctp_pcc_mailbox *outbox;
> + struct pcc_mbox_chan *pchan;
> + int mctp_pcc_mtu;
> +
> + mctp_pcc_mtu = MCTP_MIN_MTU;
> + mctp_pcc_ndev = netdev_priv(ndev);
> + outbox = &mctp_pcc_ndev->outbox;
> + pchan = pcc_mbox_request_channel(&outbox->client, outbox->index);
> + if (IS_ERR(pchan))
> + return -1;
> + if (pchan->shmem_size < MCTP_MIN_MTU)
> + return -1;
Looks like this would leak the channel? You may want to shift this check
to after the free_channel().
(but also, should this check also include the size of the pcc header?)
Cheers,
Jeremy
* Re: [net-next v36] mctp pcc: Implement MCTP over PCC Transport
2026-04-02 2:16 ` Jeremy Kerr
@ 2026-04-02 15:26 ` Adam Young
2026-04-03 1:30 ` Jeremy Kerr
0 siblings, 1 reply; 8+ messages in thread
From: Adam Young @ 2026-04-02 15:26 UTC (permalink / raw)
To: Jeremy Kerr, Adam Young, Matt Johnston, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
On 4/1/26 22:16, Jeremy Kerr wrote:
> Hi Adam,
>
>> Implementation of network driver for
>> Management Component Transport Protocol(MCTP)
>> over Platform Communication Channel(PCC)
>>
>> DMTF DSP:0292
>> Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
>>
>> The transport mechanism, Platform Communication Channels (PCC), is
>> part of the ACPI spec:
>>
>> Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html
>>
>> The PCC mechanism is managed via a mailbox implemented at
>> drivers/mailbox/pcc.c
>>
>> MCTP devices are specified via ACPI by entries in DSDT/SSDT and
>> reference channels specified in the PCCT. Messages are sent on a type
>> 3 and received on a type 4 channel. Communication with other devices
>> uses the PCC-based doorbell mechanism: a shared memory segment with a
>> corresponding interrupt and a memory register used to trigger remote
>> interrupts.
>>
>> The shared buffer must be at least 68 bytes long as that is the minimum
>> MTU as defined by the MCTP specification.
>>
>> Unlike the existing PCC Type 2 based drivers, the mssg parameter to
>> mbox_send_msg is actively used. The data section of the struct sk_buff
>> that contains the outgoing packet is sent to the mailbox, already
>> properly formatted as a PCC extended message.
>>
>> If the mailbox ring buffer is full, the driver stops the incoming
>> packet queues until a message has been sent, freeing space in the
>> ring buffer.
>>
>> When the Type 3 channel outbox receives a txdone response interrupt,
>> it consumes the outgoing sk_buff, allowing it to be freed.
>>
>> Bringing up an interface creates the channel between the network driver
>> and the mailbox driver. This enables communication with the remote
>> endpoint, including the receipt of new messages. Bringing down an
>> interface removes the channel, and no new messages can be delivered.
>> Stopping the interface will leave any packets that are cached in the
>> mailbox ringbuffer. They cannot safely be freed until the PCC mailbox
>> attempts to deliver them and has removed them from the ring buffer.
>>
>> PCC is based on a shared buffer and a set of I/O mapped memory locations
>> that the Spec calls registers. This mechanism exists regardless of the
>> existence of the driver. If the user has the ability to map these
>> physical locations to virtual locations, they have the ability to drive the
>> hardware. Thus, there is a security aspect to this mechanism that extends
>> beyond the responsibilities of the operating system.
>>
>> If the hardware does not expose the PCC in the ACPI table, this device
>> will never be enabled. Thus it is only an issue on hardware that does
>> support PCC. In that case, it is up to the remote controller to sanitize
>> communication; MCTP will be exposed as a socket interface, and userland
>> can send any crafted packet it wants. It would also be incumbent on
>> the hardware manufacturer to allow the end user to disable MCTP over PCC
>> communication if they did not want to expose it.
>>
>> Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
> No changelogs anymore? This makes things more difficult to review. Just
> to be clear, when I mentioned changing to a single-patch series, that
> did not mean to throw out the changelog - it's still a key part of
> review.
Sorry about that. Your list below is accurate.
>
> So, for v36, it looks like you have
>
> * added the CONFIG_PCC dependency
Yes. This was from Jakub's review.
> * altered the RX min length check to include at least the command and
> a MCTP header
Yes. Again, from Jakub's review, as well as thinking through what the
minimal acceptable message would be.
> * added a RX max-length check
Correct. If the message length is greater than the buffer, something is
misconfigured.
> * added a minimum MTU check
> * altered the commit message to reflect PCC mailbox free behaviour.
>
> On the latter, could you expand on what happens on close? Does the PCC
> channel end up calling tx_done() on each pending TX during the channel
> free? I'm not familiar with the PCC paths, but it doesn't look like it
> (or the mailbox core) has a case to deal with this on free.
>
> Maybe I am missing something, but could this leak skbs?
Yes, it could, but since they are held by another subsystem, and there
is no way to access them, it is safer to leak than to free. The ring
buffer in the mailbox layer is not accessible from the MCTP client,
and there is no way to force a flush of the ring buffer. This should
be a vanishingly rare case; I suspect someone would have to
deliberately send a bunch of messages and then immediately bring the
link down. Bringing the link back up would immediately send the
remaining skbs and cause them to be freed, so they are not frozen in
perpetuity with no chance of being sent.
Whether the backend could handle this is a different story.
There is a potential for this kind of leak even if we were to transfer
the data out of the skb: the two options are to either leave it in the
ring buffer or risk a use-after-free.
>
> [...]
>
>> +static int initialize_MTU(struct net_device *ndev)
>> +{
>> + struct mctp_pcc_ndev *mctp_pcc_ndev;
>> + struct mctp_pcc_mailbox *outbox;
>> + struct pcc_mbox_chan *pchan;
>> + int mctp_pcc_mtu;
>> +
>> + mctp_pcc_mtu = MCTP_MIN_MTU;
>> + mctp_pcc_ndev = netdev_priv(ndev);
>> + outbox = &mctp_pcc_ndev->outbox;
>> + pchan = pcc_mbox_request_channel(&outbox->client, outbox->index);
>> + if (IS_ERR(pchan))
>> + return -1;
>> + if (pchan->shmem_size < MCTP_MIN_MTU)
>> + return -1;
> Looks like this would leak the channel? You may want to shift this check
> to after the free_channel().
True.
>
> (but also, should this check also include the size of the pcc header?)
Yes, good catch.
>
> Cheers,
>
>
> Jeremy
* Re: [net-next v36] mctp pcc: Implement MCTP over PCC Transport
2026-04-02 15:26 ` Adam Young
@ 2026-04-03 1:30 ` Jeremy Kerr
2026-04-04 5:10 ` Adam Young
0 siblings, 1 reply; 8+ messages in thread
From: Jeremy Kerr @ 2026-04-03 1:30 UTC (permalink / raw)
To: Adam Young, Adam Young, Matt Johnston, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
Hi Adam,
> > On the latter, could you expand on what happens on close? Does the PCC
> > channel end up calling tx_done() on each pending TX during the channel
> > free? I'm not familiar with the PCC paths, but it doesn't look like it
> > (or the mailbox core) has a case to deal with this on free.
> >
> > Maybe I am missing something, but could this leak skbs?
>
> Yes, it could, but since they are held by another subsystem, and there
> is no way to access them, it is safer to leak than to free. The Ring
> Buffer in the Mailbox layer is not accessable from the MCTP Client.
> Additionally, there is no way to force a flush of ring buffer.
Sure, but it looks like the messages are essentially lost on
free; a re-bind will clear the mailbox channel's msg buf.
Without a mechanism to purge the message queue on free, it looks like
the only leak-less way to handle this is to keep track of the pending
skbs manually.
> There is a potential for this kind of leak even if we were to transfer
> the data out of SKB: the two options are to either leave it in the ring
> buffer or risk a use-after-free event.
Neither of those are good.
Where is the use-after-free here? Once the pcc_mbox_free_channel()
returns, the pending message buf seems to be forgotten, and so I can't
see how any pending skb gets referenced.
> Bringing the link back up would immediately send the
> remaining skbs and cause them to be free, so they are not frozen in
> perpetuity with no chance of being sent.
Same here - the mbox message queue looks to be reset on client bind, so
it appears that they wouldn't get sent?
If you do need to track the skbs yourself, that could be fairly simple:
keep a sk_buff_head of pending skbs, remove from the list on tx
completion, and skb_queue_purge on ndo_close.
> Whether the backend could handle this is a different story.
A cancellation callback would be handy, yeah.
Cheers,
Jeremy
* Re: [net-next v36] mctp pcc: Implement MCTP over PCC Transport
2026-04-03 1:30 ` Jeremy Kerr
@ 2026-04-04 5:10 ` Adam Young
0 siblings, 0 replies; 8+ messages in thread
From: Adam Young @ 2026-04-04 5:10 UTC (permalink / raw)
To: Jeremy Kerr, Adam Young, Matt Johnston, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
On 4/2/26 21:30, Jeremy Kerr wrote:
> Hi Adam,
>
>>> On the latter, could you expand on what happens on close? Does the PCC
>>> channel end up calling tx_done() on each pending TX during the channel
>>> free? I'm not familiar with the PCC paths, but it doesn't look like it
>>> (or the mailbox core) has a case to deal with this on free.
>>>
>>> Maybe I am missing something, but could this leak skbs?
>> Yes, it could, but since they are held by another subsystem, and there
>> is no way to access them, it is safer to leak than to free. The Ring
>> Buffer in the Mailbox layer is not accessable from the MCTP Client.
>> Additionally, there is no way to force a flush of ring buffer.
> Sure, but it looks like the messages are essentially lost on
> free; a re-bind will clear the mailbox channel's msg buf.
>
> Without a mechanism to purge the message queue on free, it looks like
> the only leak-less way to handle this is to keep track of the pending
> skbs manually.
>
>> There is a potential for this kind of leak even if we were to transfer
>> the data out of SKB: the two options are to either leave it in the ring
>> buffer or risk a use-after-free event.
> Neither of those are good.
>
> Where is the use-after-free here? Once the pcc_mbox_free_channel()
> returns, the pending message buf seems to be forgotten, and so I can't
> see how any pending skb gets referenced.
>
>> Bringing the link back up would immediately send the
>> remaining skbs and cause them to be free, so they are not frozen in
>> perpetuity with no chance of being sent.
> Same here - the mbox message queue looks to be reset on client bind, so
> it appears that they wouldn't get sent?
>
> If you do need to track the skbs yourself, that could be fairly simple:
> keep a sk_buff_head of pending skbs, remove from the list on tx
> completion, and skb_queue_purge on ndo_close.
OK, I can manually traverse the ring buffer and free the skbuffs. The
ring buffer and associated fields are in the mbox_channel, which I have
access to.
>
>> Whether the backend could handle this is a different story.
> A cancellation callback would be handy, yeah.
>
> Cheers,
>
>
> Jeremy