* [net-next v39] mctp pcc: Implement MCTP over PCC Transport
@ 2026-05-06 21:43 Adam Young
2026-05-07 3:02 ` Jeremy Kerr
0 siblings, 1 reply; 4+ messages in thread
From: Adam Young @ 2026-05-06 21:43 UTC (permalink / raw)
To: Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
Implementation of network driver for
Management Component Transport Protocol(MCTP)
over Platform Communication Channel(PCC)
DMTF DSP:0292
Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
The transport mechanism is called Platform Communication Channels (PCC)
is part of the ACPI spec:
Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html
The PCC mechanism is managed via a mailbox implemented at
drivers/mailbox/pcc.c
MCTP devices are specified via ACPI by entries in DSDT/SSDT and
reference channels specified in the PCCT. Messages are sent on a type
3 and received on a type 4 channel. Communication with other devices
use the PCC based doorbell mechanism; a shared memory segment with a
corresponding interrupt and a memory register used to trigger remote
interrupts.
The shared buffer must be at least 68 bytes long as that is the minimum
MTU as defined by the MCTP specification.
Unlike the existing PCC Type 2 based drivers, the mssg parameter to
mbox_send_msg is actively used. The data section of the struct sk_buff
that contains the outgoing packet is sent to the mailbox, already
properly formatted as a PCC exctended message.
If the mailbox ring buffer is full, the driver stops the incoming
packet queues until a message has been sent, freeing space in the
ring buffer.
When the Type 3 channel outbox receives a txdone response interrupt,
it consumes the outgoing sk_buff, allowing it to be freed.
Bringing up an interface creates the channel between the network driver
and the mailbox driver. This enables communication with the remote
endpoint, to include the receipt of new messages. Bringing down an
interface removes the channel, and no new messages can be delivered.
Stopping the interface will leave any packets that are cached in the
mailbox ringbuffer. They cannot safely be freed until the PCC mailbox
attempts to deliver them and has removed them from the ring buffer.
PCC is based on a shared buffer and a set of I/O mapped memory locations
that the Spec calls registers. This mechanism exists regardless of the
existence of the driver. If the user has the ability to map these
physical location to virtual locations, they have the ability to drive the
hardware. Thus, there is a security aspect to this mechanism that extends
beyond the responsibilities of the operating system.
If the hardware does not expose the PCC in the ACPI table, this device
will never be enabled. Thus it is only an issue on hardware that does
support PCC. In that case, it is up to the remote controller to sanitize
communication; MCTP will be exposed as a socket interface, and userland
can send any crafted packet it wants. It would also be incumbent on
the hardware manufacturer to allow the end user to disable MCTP over PCC
communication if they did not want to expose it.
Link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html
Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
---
Several rounds of code review have gone into this update. The changes
are listed roughly in the order they were made. Additional issues have
been found but require changes in the mailbox/pcc.c and the
corresponding header file. These will be submitted in a separate series
of reviews. The driver works without those changes, and any issues are
pre-existing in all PCC extended memory drivers.
Previous Version: https://lore.kernel.org/lkml/20260405180741.1496198-1-admiyo@os.amperecomputing.com/
Changes from Previous version
check that returned message length is smaller than the size of the shared buffer.
remove space after // in mctp_pcc_tx
don't say ndo in function names
don't use capital letter in MTU function name
move dev_dstats_tx_add outside of irq processing
Note AI Generated Code Review comment which seems legit:
mctp_pcc_tx() runs in process or softirq context and calls
dev_dstats_tx_dropped(). mctp_pcc_tx_done() runs in the controller's hardirq
context and calls dev_dstats_tx_add().
The generic dev_dstats_* macros use u64_stats_update_begin(), which does not
disable interrupts. If a hardware interrupt fires while the softirq path is
inside u64_stats_update_begin(), the hardirq path will re-enter the lock.
while technically there are still possible ways that the SKB send might fail
we cannot report them any later than prep_tx. In tx complete,
we are in interupt context and it is not safe to update the stats then.
mctp-pcc use little-endian conversion for shared buffer header values
explain why we test for a null SKB in mctp_pcc_tx_prepare and warn once
Deal with fragmented outbound packets
Cleanup the active message when tearing down the driver
the active message may have been sent to the backend just prior to clean up
we want to free it during cleanup, and thus make sure that the null pointer
set as the active message is not passed to the callback if the message
response interrupt gets triggered.
return PTR_ERR and EINVAL for initialization error cases
Fail initialization if there are not exactly 2 channels: an outbox and an inbox
don't advance msg_free when freeing skbs
use PCC_EXTRA_LEN to clarify and standardize length comparisons between value from buffer and size of shmem.
ensure the shared buffer is large enough to hold MCTP-PCC messages
ensure there is an active message when clearing the ring buffer
start teardown process by disabling rx
Avoid a race condition where the teardown
process frees the active req, but an interupt
resets it, leaking memory. Ensure active_req
stays clear By closing the channel first.
CONFIG option to ensure mctp is 64bit only
compare inbox MTU with Outbox MTU and take the smaller value
add mctp_pcc_change_mtu
use ACPI constant for u32 data type
default case for switch in testing PCC channels
---
MAINTAINERS | 5 +
drivers/net/mctp/Kconfig | 15 ++
drivers/net/mctp/Makefile | 1 +
drivers/net/mctp/mctp-pcc.c | 427 ++++++++++++++++++++++++++++++++++++
4 files changed, 448 insertions(+)
create mode 100644 drivers/net/mctp/mctp-pcc.c
diff --git a/MAINTAINERS b/MAINTAINERS
index cb0f2bf6fe2e..7a064e326f0a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15398,6 +15398,11 @@ F: include/net/mctpdevice.h
F: include/net/netns/mctp.h
F: net/mctp/
+MANAGEMENT COMPONENT TRANSPORT PROTOCOL (MCTP) over PCC (MCTP-PCC) Driver
+M: Adam Young <admiyo@os.amperecomputing.com>
+S: Maintained
+F: drivers/net/mctp/mctp-pcc.c
+
MAPLE TREE
M: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Alice Ryhl <aliceryhl@google.com>
diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
index cf325ab0b1ef..9c1d228e5ad7 100644
--- a/drivers/net/mctp/Kconfig
+++ b/drivers/net/mctp/Kconfig
@@ -47,6 +47,21 @@ config MCTP_TRANSPORT_I3C
A MCTP protocol network device is created for each I3C bus
having a "mctp-controller" devicetree property.
+config MCTP_TRANSPORT_PCC
+ tristate "MCTP PCC transport"
+ depends on ACPI
+ depends on PCC
+ depends on 64BIT
+ help
+ Provides a driver to access MCTP devices over PCC transport,
+ A MCTP protocol network device is created via ACPI for each
+ entry in the DSDT/SSDT that matches the identifier. The Platform
+ communication channels are selected from the corresponding
+ entries in the PCCT.
+
+ Say y here if you need to connect to MCTP endpoints over PCC. To
+ compile as a module, use m; the module will be called mctp-pcc.
+
config MCTP_TRANSPORT_USB
tristate "MCTP USB transport"
depends on USB
diff --git a/drivers/net/mctp/Makefile b/drivers/net/mctp/Makefile
index c36006849a1e..0a591299ffa9 100644
--- a/drivers/net/mctp/Makefile
+++ b/drivers/net/mctp/Makefile
@@ -1,4 +1,5 @@
obj-$(CONFIG_MCTP_SERIAL) += mctp-serial.o
obj-$(CONFIG_MCTP_TRANSPORT_I2C) += mctp-i2c.o
obj-$(CONFIG_MCTP_TRANSPORT_I3C) += mctp-i3c.o
+obj-$(CONFIG_MCTP_TRANSPORT_PCC) += mctp-pcc.o
obj-$(CONFIG_MCTP_TRANSPORT_USB) += mctp-usb.o
diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
new file mode 100644
index 000000000000..de644c188f3e
--- /dev/null
+++ b/drivers/net/mctp/mctp-pcc.c
@@ -0,0 +1,427 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * mctp-pcc.c - Driver for MCTP over PCC.
+ * Copyright (c) 2024-2026, Ampere Computing LLC
+ *
+ */
+
+/* Implementation of MCTP over PCC DMTF Specification DSP0256
+ * https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
+ */
+
+#include <linux/acpi.h>
+#include <linux/hrtimer.h>
+#include <linux/if_arp.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mailbox_client.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+#include <linux/skbuff.h>
+#include <linux/string.h>
+
+#include <acpi/acpi_bus.h>
+#include <acpi/acpi_drivers.h>
+#include <acpi/acrestyp.h>
+#include <acpi/actbl.h>
+#include <acpi/pcc.h>
+#include <net/mctp.h>
+#include <net/mctpdevice.h>
+
+#define MCTP_SIGNATURE "MCTP"
+#define MCTP_SIGNATURE_LENGTH (sizeof(MCTP_SIGNATURE) - 1)
+#define MCTP_MIN_MTU 68
+#define PCC_HEADER_SIZE sizeof(struct acpi_pcct_ext_pcc_shared_memory)
+#define MCTP_PCC_MIN_SIZE (PCC_HEADER_SIZE + MCTP_MIN_MTU)
+#define PCC_EXTRA_LEN (PCC_HEADER_SIZE - sizeof(pcc_header.command))
+struct mctp_pcc_mailbox {
+ u32 index;
+ struct pcc_mbox_chan *chan;
+ struct mbox_client client;
+};
+
+/* The netdev structure. One of these per PCC adapter. */
+struct mctp_pcc_ndev {
+ struct net_device *ndev;
+ struct acpi_device *acpi_device;
+ struct mctp_pcc_mailbox inbox;
+ struct mctp_pcc_mailbox outbox;
+};
+
+static void mctp_pcc_client_rx_callback(struct mbox_client *cl, void *mssg)
+{
+ struct acpi_pcct_ext_pcc_shared_memory pcc_header;
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *inbox;
+ struct mctp_skb_cb *cb;
+ struct sk_buff *skb;
+ u32 header_length;
+ int size;
+
+ mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, inbox.client);
+ inbox = &mctp_pcc_ndev->inbox;
+ memcpy_fromio(&pcc_header, inbox->chan->shmem, sizeof(pcc_header));
+
+ // The message must at least have the PCC command indicating it is an MCTP
+ // message followed by the MCTP header, or we have a malformed message.
+ // This may be run on big endian system, but the data in the buffer is
+ // explicitly little endian.
+ header_length = le32_to_cpu(pcc_header.length);
+ if (header_length < sizeof(pcc_header.command) + sizeof(struct mctp_hdr)) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+ // If the reported size is larger than the shared memory minus headers,
+ // something is wrong and treat the buffer as corrupted data.
+ if (header_length > inbox->chan->shmem_size - PCC_EXTRA_LEN) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+ if (memcmp(&pcc_header.command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH) != 0) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+
+ size = header_length + PCC_EXTRA_LEN;
+ skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
+ if (!skb) {
+ dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+ skb_put(skb, size);
+ skb->protocol = htons(ETH_P_MCTP);
+ memcpy_fromio(skb->data, inbox->chan->shmem, size);
+ dev_dstats_rx_add(mctp_pcc_ndev->ndev, size);
+ skb_pull(skb, sizeof(pcc_header));
+ skb_reset_mac_header(skb);
+ skb_reset_network_header(skb);
+ cb = __mctp_cb(skb);
+ cb->halen = 0;
+ netif_rx(skb);
+}
+
+static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
+{
+ struct acpi_pcct_ext_pcc_shared_memory *pcc_header;
+ struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
+ int len = skb->len;
+
+ /* Consolidated a fragmented packet into contiguous memory */
+ if (skb_is_nonlinear(skb)) {
+ if (skb_linearize(skb))
+ goto error;
+ }
+
+ if (skb_cow_head(skb, sizeof(*pcc_header)))
+ goto error;
+
+ /**
+ * This code could potentially be run on A Big Endian
+ * System. The ACPI specification requires that values
+ * in the shared buffer be little endian.
+ */
+ pcc_header = skb_push(skb, sizeof(*pcc_header));
+ pcc_header->signature = cpu_to_le32(PCC_SIGNATURE | mpnd->outbox.index);
+ pcc_header->flags = cpu_to_le32(PCC_CMD_COMPLETION_NOTIFY);
+ memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
+ pcc_header->length = cpu_to_le32(len + MCTP_SIGNATURE_LENGTH);
+
+ if (mbox_send_message(mpnd->outbox.chan->mchan, skb) < 0) {
+ // Remove the header in case it gets sent again
+ skb_pull(skb, sizeof(*pcc_header));
+ netif_stop_queue(ndev);
+ return NETDEV_TX_BUSY;
+ }
+
+ return NETDEV_TX_OK;
+error:
+ dev_dstats_tx_dropped(ndev);
+ kfree_skb(skb);
+ return NETDEV_TX_OK;
+}
+
+static void mctp_pcc_tx_prepare(struct mbox_client *cl, void *mssg)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *outbox;
+ struct sk_buff *skb = mssg;
+
+ mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, outbox.client);
+ outbox = &mctp_pcc_ndev->outbox;
+
+ /* The PCC Mailbox typically does not make use of the mssg pointer
+ * The mctp-over pcc driver is the only client that uses it.
+ * This value should always be non-null; it is possible
+ * that a change in the Mailbox level will break that assumption.
+ */
+ if (!skb) {
+ WARN_ONCE(!skb, "%s called with null message.\n", __func__);
+ return;
+ }
+
+ if (skb->len > outbox->chan->shmem_size) {
+ dev_dstats_tx_dropped(mctp_pcc_ndev->ndev);
+ return;
+ }
+ memcpy_toio(outbox->chan->shmem, skb->data, skb->len);
+ dev_dstats_tx_add(mctp_pcc_ndev->ndev, skb->len);
+}
+
+static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct sk_buff *skb = mssg;
+
+ /*
+ * If there is a packet in flight during driver cleanup
+ * It may have been freed already.
+ */
+ if (!mssg)
+ return;
+ /*
+ * If the return code is non-zero, we should not report the packet
+ * as transmitted. However, we are in IRQ context right now, and we
+ * cannot safely write transmission statistics.
+ */
+ mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
+ dev_consume_skb_any(skb);
+ netif_wake_queue(mctp_pcc_ndev->ndev);
+}
+
+static int mctp_pcc_open(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev = netdev_priv(ndev);
+ struct mctp_pcc_mailbox *outbox, *inbox;
+
+ outbox = &mctp_pcc_ndev->outbox;
+ inbox = &mctp_pcc_ndev->inbox;
+
+ outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
+ if (IS_ERR(outbox->chan))
+ return PTR_ERR(outbox->chan);
+
+ if (outbox->chan->shmem_size < MCTP_PCC_MIN_SIZE) {
+ pcc_mbox_free_channel(outbox->chan);
+ return -EINVAL;
+ }
+
+ inbox->client.rx_callback = mctp_pcc_client_rx_callback;
+ inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
+ if (IS_ERR(inbox->chan)) {
+ pcc_mbox_free_channel(outbox->chan);
+ return PTR_ERR(inbox->chan);
+ }
+ if (inbox->chan->shmem_size < MCTP_PCC_MIN_SIZE) {
+ pcc_mbox_free_channel(outbox->chan);
+ pcc_mbox_free_channel(inbox->chan);
+ return -EINVAL;
+ }
+ return 0;
+}
+
+static int mctp_pcc_stop(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ unsigned int count, idx;
+ struct mbox_chan *chan;
+ struct sk_buff *skb;
+
+ mctp_pcc_ndev = netdev_priv(ndev);
+ chan = mctp_pcc_ndev->outbox.chan->mchan;
+ pcc_mbox_free_channel(mctp_pcc_ndev->inbox.chan);
+ scoped_guard(spinlock_irqsave, &chan->lock) {
+ skb = chan->active_req;
+ chan->active_req = NULL;
+ if (skb) {
+ dev_dstats_tx_dropped(ndev);
+ dev_consume_skb_any(skb);
+ }
+ while (chan->msg_count > 0) {
+ count = chan->msg_count;
+ idx = chan->msg_free;
+ if (idx >= count)
+ idx -= count;
+ else
+ idx += MBOX_TX_QUEUE_LEN - count;
+ skb = chan->msg_data[idx];
+ dev_dstats_tx_dropped(ndev);
+ dev_consume_skb_any(skb);
+ chan->msg_count--;
+ }
+ }
+ pcc_mbox_free_channel(mctp_pcc_ndev->outbox.chan);
+ return 0;
+}
+
+static int mctp_pcc_change_mtu(struct net_device *dev, int new_mtu)
+{
+ if (new_mtu > dev->max_mtu)
+ return -EINVAL;
+ if (new_mtu < MCTP_MIN_MTU)
+ return -EINVAL;
+ return 0;
+}
+
+static const struct net_device_ops mctp_pcc_netdev_ops = {
+ .ndo_open = mctp_pcc_open,
+ .ndo_stop = mctp_pcc_stop,
+ .ndo_start_xmit = mctp_pcc_tx,
+ .ndo_change_mtu = mctp_pcc_change_mtu,
+};
+
+static void mctp_pcc_setup(struct net_device *ndev)
+{
+ ndev->type = ARPHRD_MCTP;
+ ndev->hard_header_len = sizeof(struct acpi_pcct_ext_pcc_shared_memory);
+ ndev->tx_queue_len = 0;
+ ndev->flags = IFF_NOARP;
+ ndev->netdev_ops = &mctp_pcc_netdev_ops;
+ ndev->needs_free_netdev = true;
+ ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
+}
+
+struct mctp_pcc_lookup_context {
+ int index;
+ u32 inbox_index;
+ u32 outbox_index;
+};
+
+static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
+ void *context)
+{
+ struct mctp_pcc_lookup_context *luc = context;
+ struct acpi_resource_address32 *addr;
+
+ if (ares->type != ACPI_RESOURCE_TYPE_ADDRESS32)
+ return AE_OK;
+
+ addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
+ switch (luc->index) {
+ case 0:
+ luc->outbox_index = addr[0].address.minimum;
+ break;
+ case 1:
+ luc->inbox_index = addr[0].address.minimum;
+ break;
+ default:
+ return AE_ERROR;
+ }
+ luc->index++;
+ return AE_OK;
+}
+
+static void mctp_cleanup_netdev(void *data)
+{
+ struct net_device *ndev = data;
+
+ mctp_unregister_netdev(ndev);
+}
+
+static int initialize_mtu(struct net_device *ndev)
+{
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct mctp_pcc_mailbox *outbox;
+ struct pcc_mbox_chan *pchan;
+ int mctp_pcc_max_mtu;
+
+ mctp_pcc_ndev = netdev_priv(ndev);
+ outbox = &mctp_pcc_ndev->outbox;
+ pchan = pcc_mbox_request_channel(&outbox->client, outbox->index);
+ if (IS_ERR(pchan))
+ return PTR_ERR(pchan);
+ if (pchan->shmem_size < MCTP_MIN_MTU + sizeof(struct acpi_pcct_ext_pcc_shared_memory)) {
+ pcc_mbox_free_channel(pchan);
+ return -EINVAL;
+ }
+ mctp_pcc_max_mtu = pchan->shmem_size - sizeof(struct acpi_pcct_ext_pcc_shared_memory);
+ pcc_mbox_free_channel(pchan);
+
+ ndev->mtu = MCTP_MIN_MTU;
+ ndev->max_mtu = mctp_pcc_max_mtu;
+ ndev->min_mtu = MCTP_MIN_MTU;
+
+ return 0;
+}
+
+static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
+{
+ struct mctp_pcc_lookup_context context = {0};
+ struct mctp_pcc_ndev *mctp_pcc_ndev;
+ struct device *dev = &acpi_dev->dev;
+ struct net_device *ndev;
+ acpi_handle dev_handle;
+ acpi_status status;
+ char name[32];
+ int rc;
+
+ dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
+ acpi_device_hid(acpi_dev));
+ dev_handle = acpi_device_handle(acpi_dev);
+ status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
+ &context);
+ if (!ACPI_SUCCESS(status)) {
+ dev_err(dev, "FAILED to lookup PCC indexes from CRS\n");
+ return -EINVAL;
+ }
+
+ /*
+ * Ensure we have exactly 2 channels: an outbox and an inbox.
+ */
+ if (context.index != 2)
+ return -EINVAL;
+
+ snprintf(name, sizeof(name), "mctppcc%d", context.inbox_index);
+ ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
+ mctp_pcc_setup);
+ if (!ndev)
+ return -ENOMEM;
+
+ mctp_pcc_ndev = netdev_priv(ndev);
+
+ mctp_pcc_ndev->inbox.index = context.inbox_index;
+ mctp_pcc_ndev->inbox.client.dev = dev;
+ mctp_pcc_ndev->outbox.index = context.outbox_index;
+ mctp_pcc_ndev->outbox.client.dev = dev;
+
+ mctp_pcc_ndev->outbox.client.tx_prepare = mctp_pcc_tx_prepare;
+ mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
+ mctp_pcc_ndev->acpi_device = acpi_dev;
+ mctp_pcc_ndev->ndev = ndev;
+ acpi_dev->driver_data = mctp_pcc_ndev;
+
+ rc = initialize_mtu(ndev);
+ if (rc)
+ goto free_netdev;
+
+ rc = mctp_register_netdev(ndev, NULL, MCTP_PHYS_BINDING_PCC);
+ if (rc)
+ goto free_netdev;
+
+ return devm_add_action_or_reset(dev, mctp_cleanup_netdev, ndev);
+free_netdev:
+ free_netdev(ndev);
+ return rc;
+}
+
+static const struct acpi_device_id mctp_pcc_device_ids[] = {
+ { "DMT0001" },
+ {}
+};
+
+static struct acpi_driver mctp_pcc_driver = {
+ .name = "mctp_pcc",
+ .class = "Unknown",
+ .ids = mctp_pcc_device_ids,
+ .ops = {
+ .add = mctp_pcc_driver_add,
+ },
+};
+
+module_acpi_driver(mctp_pcc_driver);
+
+MODULE_DEVICE_TABLE(acpi, mctp_pcc_device_ids);
+
+MODULE_DESCRIPTION("MCTP PCC ACPI device");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Adam Young <admiyo@os.amperecomputing.com>");
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [net-next v39] mctp pcc: Implement MCTP over PCC Transport
2026-05-06 21:43 [net-next v39] mctp pcc: Implement MCTP over PCC Transport Adam Young
@ 2026-05-07 3:02 ` Jeremy Kerr
2026-05-07 16:00 ` Adam Young
0 siblings, 1 reply; 4+ messages in thread
From: Jeremy Kerr @ 2026-05-07 3:02 UTC (permalink / raw)
To: Adam Young, Matt Johnston, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
Hi Adam,
> +static void mctp_pcc_client_rx_callback(struct mbox_client *cl, void *mssg)
> +{
> + struct acpi_pcct_ext_pcc_shared_memory pcc_header;
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct mctp_pcc_mailbox *inbox;
> + struct mctp_skb_cb *cb;
> + struct sk_buff *skb;
> + u32 header_length;
> + int size;
> +
> + mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, inbox.client);
> + inbox = &mctp_pcc_ndev->inbox;
> + memcpy_fromio(&pcc_header, inbox->chan->shmem, sizeof(pcc_header));
> +
> + // The message must at least have the PCC command indicating it is an MCTP
> + // message followed by the MCTP header, or we have a malformed message.
> + // This may be run on big endian system, but the data in the buffer is
> + // explicitly little endian.
> + header_length = le32_to_cpu(pcc_header.length);
> + if (header_length < sizeof(pcc_header.command) + sizeof(struct mctp_hdr)) {
> + dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
> + // If the reported size is larger than the shared memory minus headers,
> + // something is wrong and treat the buffer as corrupted data.
> + if (header_length > inbox->chan->shmem_size - PCC_EXTRA_LEN) {
> + dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
> + if (memcmp(&pcc_header.command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH) != 0) {
> + dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
> +
> + size = header_length + PCC_EXTRA_LEN;
> + skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
> + if (!skb) {
> + dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
Minor, but you have four instances of the dstats_rx_dropped call, that
may work as a shared error path.
> + skb_put(skb, size);
> + skb->protocol = htons(ETH_P_MCTP);
> + memcpy_fromio(skb->data, inbox->chan->shmem, size);
> + dev_dstats_rx_add(mctp_pcc_ndev->ndev, size);
> + skb_pull(skb, sizeof(pcc_header));
> + skb_reset_mac_header(skb);
> + skb_reset_network_header(skb);
> + cb = __mctp_cb(skb);
> + cb->halen = 0;
> + netif_rx(skb);
> +}
> +
> +static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
> +{
> + struct acpi_pcct_ext_pcc_shared_memory *pcc_header;
> + struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
> + int len = skb->len;
> +
> + /* Consolidated a fragmented packet into contiguous memory */
> + if (skb_is_nonlinear(skb)) {
> + if (skb_linearize(skb))
> + goto error;
> + }
skb_linearize() already has the skb_is_nonlinear() check.
However, you don't need to call skb_linearize() anyway, as that will
happen for you in validate_xmit_skb(), since the driver does not
advertise support for nonlinear skbs.
> +
> + if (skb_cow_head(skb, sizeof(*pcc_header)))
> + goto error;
> +
> + /**
> + * This code could potentially be run on A Big Endian
> + * System. The ACPI specification requires that values
> + * in the shared buffer be little endian.
> + */
Minor, but I don't think this comment is necessary.
More importantly though, you're getting a bunch of warnings from the use
of these endian helpers:
drivers/net/mctp/mctp-pcc.c:70:25: warning: cast to restricted __le32
drivers/net/mctp/mctp-pcc.c:125:31: warning: incorrect type in assignment (different base types)
drivers/net/mctp/mctp-pcc.c:125:31: expected unsigned int [usertype] signature
drivers/net/mctp/mctp-pcc.c:125:31: got restricted __le32 [usertype]
drivers/net/mctp/mctp-pcc.c:126:27: warning: incorrect type in assignment (different base types)
drivers/net/mctp/mctp-pcc.c:126:27: expected unsigned int [usertype] flags
drivers/net/mctp/mctp-pcc.c:126:27: got restricted __le32 [usertype]
drivers/net/mctp/mctp-pcc.c:128:28: warning: incorrect type in assignment (different base types)
drivers/net/mctp/mctp-pcc.c:128:28: expected unsigned int [usertype] length
drivers/net/mctp/mctp-pcc.c:128:28: got restricted __le32 [usertype]
Looks like struct acpi_pcct_ext_pcc_shared_memory does not specify
fields as little endian, but host endian.
You could change to endian-annotated types to those ACPI structs, but
there are probably reasons why that's not already done. This might be an
indication that there are no big-endian ACPI platforms, so you don't
need the endian conversions?
> + pcc_header = skb_push(skb, sizeof(*pcc_header));
> + pcc_header->signature = cpu_to_le32(PCC_SIGNATURE | mpnd->outbox.index);
> + pcc_header->flags = cpu_to_le32(PCC_CMD_COMPLETION_NOTIFY);
> + memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
> + pcc_header->length = cpu_to_le32(len + MCTP_SIGNATURE_LENGTH);
> +
> + if (mbox_send_message(mpnd->outbox.chan->mchan, skb) < 0) {
> + // Remove the header in case it gets sent again
> + skb_pull(skb, sizeof(*pcc_header));
> + netif_stop_queue(ndev);
> + return NETDEV_TX_BUSY;
> + }
> +
> + return NETDEV_TX_OK;
> +error:
> + dev_dstats_tx_dropped(ndev);
> + kfree_skb(skb);
> + return NETDEV_TX_OK;
> +}
> +
> +static void mctp_pcc_tx_prepare(struct mbox_client *cl, void *mssg)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct mctp_pcc_mailbox *outbox;
> + struct sk_buff *skb = mssg;
> +
> + mctp_pcc_ndev = container_of(cl, struct mctp_pcc_ndev, outbox.client);
> + outbox = &mctp_pcc_ndev->outbox;
> +
> + /* The PCC Mailbox typically does not make use of the mssg pointer
> + * The mctp-over pcc driver is the only client that uses it.
> + * This value should always be non-null; it is possible
> + * that a change in the Mailbox level will break that assumption.
> + */
> + if (!skb) {
> + WARN_ONCE(!skb, "%s called with null message.\n", __func__);
> + return;
> + }
I'd suggest netdev_warn_once(), so we know which device is causing this.
> +
> + if (skb->len > outbox->chan->shmem_size) {
> + dev_dstats_tx_dropped(mctp_pcc_ndev->ndev);
> + return;
> + }
> + memcpy_toio(outbox->chan->shmem, skb->data, skb->len);
Super minor: double space before skb->data there.
> + dev_dstats_tx_add(mctp_pcc_ndev->ndev, skb->len);
> +}
> +
> +static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + struct sk_buff *skb = mssg;
> +
> + /*
> + * If there is a packet in flight during driver cleanup
> + * It may have been freed already.
> + */
> + if (!mssg)
> + return;
> + /*
> + * If the return code is non-zero, we should not report the packet
> + * as transmitted. However, we are in IRQ context right now, and we
> + * cannot safely write transmission statistics.
> + */
This reads as if you're not updating stats at all, but you do so in
mctp_pcc_tx_prepare(). I don't think this comment is necessary - if
you really want to mention this, add a comment on the
dev_dstats_tx_add() to indicate why you're calling it early.
> + mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
> + dev_consume_skb_any(skb);
> + netif_wake_queue(mctp_pcc_ndev->ndev);
> +}
> +
> +static int mctp_pcc_open(struct net_device *ndev)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev = netdev_priv(ndev);
> + struct mctp_pcc_mailbox *outbox, *inbox;
> +
> + outbox = &mctp_pcc_ndev->outbox;
> + inbox = &mctp_pcc_ndev->inbox;
> +
> + outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
> + if (IS_ERR(outbox->chan))
> + return PTR_ERR(outbox->chan);
> +
> + if (outbox->chan->shmem_size < MCTP_PCC_MIN_SIZE) {
Minor: odd spacing here.
> + pcc_mbox_free_channel(outbox->chan);
> + return -EINVAL;
> + }
> +
> + inbox->client.rx_callback = mctp_pcc_client_rx_callback;
> + inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
> + if (IS_ERR(inbox->chan)) {
> + pcc_mbox_free_channel(outbox->chan);
> + return PTR_ERR(inbox->chan);
> + }
> + if (inbox->chan->shmem_size < MCTP_PCC_MIN_SIZE) {
> + pcc_mbox_free_channel(outbox->chan);
> + pcc_mbox_free_channel(inbox->chan);
> + return -EINVAL;
> + }
> + return 0;
> +}
> +
> +static int mctp_pcc_stop(struct net_device *ndev)
> +{
> + struct mctp_pcc_ndev *mctp_pcc_ndev;
> + unsigned int count, idx;
> + struct mbox_chan *chan;
> + struct sk_buff *skb;
> +
> + mctp_pcc_ndev = netdev_priv(ndev);
> + chan = mctp_pcc_ndev->outbox.chan->mchan;
> + pcc_mbox_free_channel(mctp_pcc_ndev->inbox.chan);
> + scoped_guard(spinlock_irqsave, &chan->lock) {
> + skb = chan->active_req;
> + chan->active_req = NULL;
> + if (skb) {
> + dev_dstats_tx_dropped(ndev);
> + dev_consume_skb_any(skb);
> + }
> + while (chan->msg_count > 0) {
> + count = chan->msg_count;
> + idx = chan->msg_free;
> + if (idx >= count)
> + idx -= count;
> + else
> + idx += MBOX_TX_QUEUE_LEN - count;
> + skb = chan->msg_data[idx];
> + dev_dstats_tx_dropped(ndev);
> + dev_consume_skb_any(skb);
> + chan->msg_count--;
> + }
> + }
I'm going to assume this is all OK with respect to PCC channel activity.
> + pcc_mbox_free_channel(mctp_pcc_ndev->outbox.chan);
> + return 0;
> +}
> +
> +static int mctp_pcc_change_mtu(struct net_device *dev, int new_mtu)
> +{
> + if (new_mtu > dev->max_mtu)
> + return -EINVAL;
> + if (new_mtu < MCTP_MIN_MTU)
> + return -EINVAL;
> + return 0;
> +}
The core already does these checks, I don't think you need a
ndo_change_mtu callback at all.
Cheers,
Jeremy
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [net-next v39] mctp pcc: Implement MCTP over PCC Transport
2026-05-07 3:02 ` Jeremy Kerr
@ 2026-05-07 16:00 ` Adam Young
2026-05-08 2:06 ` Jeremy Kerr
0 siblings, 1 reply; 4+ messages in thread
From: Adam Young @ 2026-05-07 16:00 UTC (permalink / raw)
To: Jeremy Kerr, Adam Young, Matt Johnston, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li
All the other will be taken in as changes
On 5/6/26 23:02, Jeremy Kerr wrote:
>> +static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
>> +{
>> + struct mctp_pcc_ndev *mctp_pcc_ndev;
>> + struct sk_buff *skb = mssg;
>> +
>> + /*
>> + * If there is a packet in flight during driver cleanup
>> + * It may have been freed already.
>> + */
>> + if (!mssg)
>> + return;
>> + /*
>> + * If the return code is non-zero, we should not report the packet
>> + * as transmitted. However, we are in IRQ context right now, and we
>> + * cannot safely write transmission statistics.
>> + */
> This reads as if you're not updating stats at all, but you do so in
> mctp_pcc_tx_prepare(). I don't think this comment is necessary - if
> you really want to mention this, add a comment on the
> dev_dstats_tx_add() to indicate why you're calling it early.
This comment is in prep for a fairly large change in the PCC layer to
address it.
This statistic should be reported in tx_done, but cannot be done safely
yet. The fix is to get tx_done out of a hard-irq handler. I will submit
that as a follow on changes to mailbox/pcc.c and mctp-pc.c
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [net-next v39] mctp pcc: Implement MCTP over PCC Transport
2026-05-07 16:00 ` Adam Young
@ 2026-05-08 2:06 ` Jeremy Kerr
0 siblings, 0 replies; 4+ messages in thread
From: Jeremy Kerr @ 2026-05-08 2:06 UTC (permalink / raw)
To: Adam Young, Adam Young, Matt Johnston, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Sudeep Holla, Huisong Li
Hi Adam,
> > This reads as if you're not updating stats at all, but you do so in
> > mctp_pcc_tx_prepare(). I don't think this comment is necessary - if
> > you really want to mention this, add a comment on the
> > dev_dstats_tx_add() to indicate why you're calling it early.
>
> This comment is in prep for a fairly large change in the PCC layer to
> address it.
>
> This statistic should be reported in tx_done, but cannot be done safely
> yet. The fix is to get tx_done out of a hard-irq handler. I will submit
> that as a follow on changes to mailbox/pcc.c and mctp-pc.c
That's all good, especially if there's a fix coming, but my suggestion
still stands - it would make more sense to comment the stats update in
_prepare(), rather than the absence of a stats update in _done().
Cheers,
Jeremy
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-08 2:06 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-06 21:43 [net-next v39] mctp pcc: Implement MCTP over PCC Transport Adam Young
2026-05-07 3:02 ` Jeremy Kerr
2026-05-07 16:00 ` Adam Young
2026-05-08 2:06 ` Jeremy Kerr
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox