linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-19 20:51 admiyo
@ 2025-08-19 20:51 ` admiyo
  2025-08-22  8:21   ` Jeremy Kerr
  0 siblings, 1 reply; 14+ messages in thread
From: admiyo @ 2025-08-19 20:51 UTC (permalink / raw)
  To: Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Adam Young
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

From: Adam Young <admiyo@os.amperecomputing.com>

Implementation of network driver for
Management Control Transport Protocol(MCTP)
over Platform Communication Channel(PCC)

DMTF DSP:0292
https://www.dmtf.org/sites/default/files/standards/documents/\
DSP0292_1.0.0WIP50.pdf

MCTP devices are specified via ACPI by entries
in DSDT/SDST and reference channels specified
in the PCCT.  Messages are sent on a type 3 and
received on a type 4 channel.  Communication with
other devices use the PCC based doorbell mechanism;
a shared memory segment with a corresponding
interrupt and a memory register used to trigger
remote interrupts.

This driver takes advantage of PCC mailbox buffer
management. The data section of the struct sk_buff
that contains the outgoing packet is sent to the mailbox,
already properly formatted  as a PCC message.  The driver
is also responsible for allocating a struct sk_buff that
is then passed to the mailbox and used to record the
data in the shared buffer. It maintains a list of both
outging and incoming sk_buffs to match the data buffers

When the Type 3 channel outbox receives a txdone response
interrupt, it consumes the outgoing sk_buff, allowing
it to be freed.

with the original sk_buffs. This requires a netdevice
specific spinlock.  Optimizing this would require a
change to the mailbox API.

Bringing the interface up and down creates and frees
the channel between the network driver and the mailbox
driver. Freeig the channel also frees any packets that
are cached in the mailbox ringbuffer.

Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
---
 MAINTAINERS                 |   5 +
 drivers/net/mctp/Kconfig    |  13 ++
 drivers/net/mctp/Makefile   |   1 +
 drivers/net/mctp/mctp-pcc.c | 379 ++++++++++++++++++++++++++++++++++++
 4 files changed, 398 insertions(+)
 create mode 100644 drivers/net/mctp/mctp-pcc.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 4dcce7a5894b..984907a41e2d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14660,6 +14660,11 @@ F:	include/net/mctpdevice.h
 F:	include/net/netns/mctp.h
 F:	net/mctp/
 
+MANAGEMENT COMPONENT TRANSPORT PROTOCOL (MCTP) over PCC (MCTP-PCC) Driver
+M:	Adam Young <admiyo@os.amperecomputing.com>
+S:	Maintained
+F:	drivers/net/mctp/mctp-pcc.c
+
 MAPLE TREE
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
 L:	maple-tree@lists.infradead.org
diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
index cf325ab0b1ef..f69d0237f058 100644
--- a/drivers/net/mctp/Kconfig
+++ b/drivers/net/mctp/Kconfig
@@ -57,6 +57,19 @@ config MCTP_TRANSPORT_USB
 	  MCTP-over-USB interfaces are peer-to-peer, so each interface
 	  represents a physical connection to one remote MCTP endpoint.
 
+config MCTP_TRANSPORT_PCC
+	tristate "MCTP PCC transport"
+	depends on ACPI
+	help
+	  Provides a driver to access MCTP devices over PCC transport,
+	  A MCTP protocol network device is created via ACPI for each
+	  entry in the DST/SDST that matches the identifier. The Platform
+	  communication channels are selected from the corresponding
+	  entries in the PCCT.
+
+	  Say y here if you need to connect to MCTP endpoints over PCC. To
+	  compile as a module, use m; the module will be called mctp-pcc.
+
 endmenu
 
 endif
diff --git a/drivers/net/mctp/Makefile b/drivers/net/mctp/Makefile
index c36006849a1e..2276f148df7c 100644
--- a/drivers/net/mctp/Makefile
+++ b/drivers/net/mctp/Makefile
@@ -1,3 +1,4 @@
+obj-$(CONFIG_MCTP_TRANSPORT_PCC) += mctp-pcc.o
 obj-$(CONFIG_MCTP_SERIAL) += mctp-serial.o
 obj-$(CONFIG_MCTP_TRANSPORT_I2C) += mctp-i2c.o
 obj-$(CONFIG_MCTP_TRANSPORT_I3C) += mctp-i3c.o
diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
new file mode 100644
index 000000000000..7f5d0245b73b
--- /dev/null
+++ b/drivers/net/mctp/mctp-pcc.c
@@ -0,0 +1,379 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * mctp-pcc.c - Driver for MCTP over PCC.
+ * Copyright (c) 2024-2025, Ampere Computing LLC
+ *
+ */
+
+/* Implementation of MCTP over PCC DMTF Specification DSP0256
+ * https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
+ */
+
+#include <linux/acpi.h>
+#include <linux/if_arp.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+#include <linux/string.h>
+#include <linux/skbuff.h>
+#include <linux/hrtimer.h>
+
+#include <acpi/acpi_bus.h>
+#include <acpi/acpi_drivers.h>
+#include <acpi/acrestyp.h>
+#include <acpi/actbl.h>
+#include <net/mctp.h>
+#include <net/mctpdevice.h>
+#include <acpi/pcc.h>
+
+#include "../../mailbox/mailbox.h"
+
+#define MCTP_SIGNATURE          "MCTP"
+#define MCTP_SIGNATURE_LENGTH   (sizeof(MCTP_SIGNATURE) - 1)
+#define MCTP_MIN_MTU            68
+#define PCC_DWORD_TYPE          0x0c
+
+struct mctp_pcc_mailbox {
+	u32 index;
+	struct pcc_mbox_chan *chan;
+	struct mbox_client client;
+	struct sk_buff_head packets;
+};
+
+/* The netdev structure. One of these per PCC adapter. */
+struct mctp_pcc_ndev {
+	/* spinlock to serialize access to queue that holds a copy of the
+	 * sk_buffs that are also in the ring buffers of the mailbox.
+	 */
+	spinlock_t lock;
+	struct net_device *ndev;
+	struct acpi_device *acpi_device;
+	struct mctp_pcc_mailbox inbox;
+	struct mctp_pcc_mailbox outbox;
+};
+
+static void *mctp_pcc_rx_alloc(struct mbox_client *c, int size)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct mctp_pcc_mailbox *box;
+	struct sk_buff *skb;
+
+	mctp_pcc_ndev =	container_of(c, struct mctp_pcc_ndev, inbox.client);
+	box = &mctp_pcc_ndev->inbox;
+
+	if (size > mctp_pcc_ndev->ndev->mtu)
+		return NULL;
+	skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
+	if (!skb)
+		return NULL;
+	skb_put(skb, size);
+	skb->protocol = htons(ETH_P_MCTP);
+
+	spin_lock(&mctp_pcc_ndev->lock);
+	skb_queue_head(&box->packets, skb);
+	spin_unlock(&mctp_pcc_ndev->lock);
+
+	return skb->data;
+}
+
+static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *buffer)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct pcc_header pcc_header;
+	struct sk_buff *skb = NULL;
+	struct mctp_skb_cb *cb;
+
+	mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
+	if (!buffer) {
+		dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+		return;
+	}
+
+	spin_lock(&mctp_pcc_ndev->lock);
+	skb_queue_walk(&mctp_pcc_ndev->inbox.packets, skb) {
+		if (skb->data != buffer)
+			continue;
+		skb_unlink(skb, &mctp_pcc_ndev->inbox.packets);
+		break;
+	}
+	spin_unlock(&mctp_pcc_ndev->lock);
+
+	if (skb) {
+		dev_dstats_rx_add(mctp_pcc_ndev->ndev, skb->len);
+		skb_reset_mac_header(skb);
+		skb_pull(skb, sizeof(pcc_header));
+		skb_reset_network_header(skb);
+		cb = __mctp_cb(skb);
+		cb->halen = 0;
+		netif_rx(skb);
+	}
+}
+
+static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct mctp_pcc_mailbox *box;
+	struct sk_buff *skb = NULL;
+
+	mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
+	box = container_of(c, struct mctp_pcc_mailbox, client);
+	spin_lock(&mctp_pcc_ndev->lock);
+	skb_queue_walk(&box->packets, skb) {
+		if (skb->data == mssg) {
+			skb_unlink(skb, &box->packets);
+			break;
+		}
+	}
+	spin_unlock(&mctp_pcc_ndev->lock);
+
+	if (skb)
+		dev_consume_skb_any(skb);
+}
+
+static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
+{
+	struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
+	struct pcc_header *pcc_header;
+	int len = skb->len;
+	int rc;
+
+	rc = skb_cow_head(skb, sizeof(*pcc_header));
+	if (rc) {
+		dev_dstats_tx_dropped(ndev);
+		kfree_skb(skb);
+		return NETDEV_TX_OK;
+	}
+
+	pcc_header = skb_push(skb, sizeof(*pcc_header));
+	pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
+	pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
+	memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
+	pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
+
+	spin_lock(&mpnd->lock);
+	skb_queue_head(&mpnd->outbox.packets, skb);
+	spin_unlock(&mpnd->lock);
+
+	rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
+
+	if (rc < 0) {
+		skb_unlink(skb, &mpnd->outbox.packets);
+		return NETDEV_TX_BUSY;
+	}
+
+	dev_dstats_tx_add(ndev, len);
+	return NETDEV_TX_OK;
+}
+
+static void drain_packets(struct sk_buff_head *list)
+{
+	struct sk_buff *skb;
+
+	while (!skb_queue_empty(list)) {
+		skb = skb_dequeue(list);
+		dev_consume_skb_any(skb);
+	}
+}
+
+static int mctp_pcc_ndo_open(struct net_device *ndev)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev =
+	    netdev_priv(ndev);
+	struct mctp_pcc_mailbox *outbox =
+	    &mctp_pcc_ndev->outbox;
+	struct mctp_pcc_mailbox *inbox =
+	    &mctp_pcc_ndev->inbox;
+	int mctp_pcc_mtu;
+
+	outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
+	if (IS_ERR(outbox->chan))
+		return PTR_ERR(outbox->chan);
+
+	inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
+	if (IS_ERR(inbox->chan)) {
+		pcc_mbox_free_channel(outbox->chan);
+		return PTR_ERR(inbox->chan);
+	}
+
+	mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
+	mctp_pcc_ndev->outbox.chan->manage_writes = true;
+
+	/* There is no clean way to pass the MTU to the callback function
+	 * used for registration, so set the values ahead of time.
+	 */
+	mctp_pcc_mtu = mctp_pcc_ndev->outbox.chan->shmem_size -
+		sizeof(struct pcc_header);
+	ndev->mtu = MCTP_MIN_MTU;
+	ndev->max_mtu = mctp_pcc_mtu;
+	ndev->min_mtu = MCTP_MIN_MTU;
+
+	return 0;
+}
+
+static int mctp_pcc_ndo_stop(struct net_device *ndev)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev =
+	    netdev_priv(ndev);
+	struct mctp_pcc_mailbox *outbox =
+	    &mctp_pcc_ndev->outbox;
+	struct mctp_pcc_mailbox *inbox =
+	    &mctp_pcc_ndev->inbox;
+
+	pcc_mbox_free_channel(outbox->chan);
+	pcc_mbox_free_channel(inbox->chan);
+
+	spin_lock(&mctp_pcc_ndev->lock);
+	drain_packets(&mctp_pcc_ndev->outbox.packets);
+	drain_packets(&mctp_pcc_ndev->inbox.packets);
+	spin_unlock(&mctp_pcc_ndev->lock);
+	return 0;
+}
+
+static const struct net_device_ops mctp_pcc_netdev_ops = {
+	.ndo_open = mctp_pcc_ndo_open,
+	.ndo_stop = mctp_pcc_ndo_stop,
+	.ndo_start_xmit = mctp_pcc_tx,
+
+};
+
+static void mctp_pcc_setup(struct net_device *ndev)
+{
+	ndev->type = ARPHRD_MCTP;
+	ndev->hard_header_len = 0;
+	ndev->tx_queue_len = 0;
+	ndev->flags = IFF_NOARP;
+	ndev->netdev_ops = &mctp_pcc_netdev_ops;
+	ndev->needs_free_netdev = true;
+	ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
+}
+
+struct mctp_pcc_lookup_context {
+	int index;
+	u32 inbox_index;
+	u32 outbox_index;
+};
+
+static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
+				       void *context)
+{
+	struct mctp_pcc_lookup_context *luc = context;
+	struct acpi_resource_address32 *addr;
+
+	if (ares->type != PCC_DWORD_TYPE)
+		return AE_OK;
+
+	addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
+	switch (luc->index) {
+	case 0:
+		luc->outbox_index = addr[0].address.minimum;
+		break;
+	case 1:
+		luc->inbox_index = addr[0].address.minimum;
+		break;
+	}
+	luc->index++;
+	return AE_OK;
+}
+
+static void mctp_cleanup_netdev(void *data)
+{
+	struct net_device *ndev = data;
+
+	mctp_unregister_netdev(ndev);
+}
+
+static int mctp_pcc_initialize_mailbox(struct device *dev,
+				       struct mctp_pcc_mailbox *box, u32 index)
+{
+	box->index = index;
+	skb_queue_head_init(&box->packets);
+	box->client.dev = dev;
+	return 0;
+}
+
+static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
+{
+	struct mctp_pcc_lookup_context context = {0};
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct device *dev = &acpi_dev->dev;
+	struct net_device *ndev;
+	acpi_handle dev_handle;
+	acpi_status status;
+	char name[32];
+	int rc;
+
+	dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
+		acpi_device_hid(acpi_dev));
+	dev_handle = acpi_device_handle(acpi_dev);
+	status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
+				     &context);
+	if (!ACPI_SUCCESS(status)) {
+		dev_err(dev, "FAILURE to lookup PCC indexes from CRS\n");
+		return -EINVAL;
+	}
+
+	snprintf(name, sizeof(name), "mctpipcc%d", context.inbox_index);
+	ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
+			    mctp_pcc_setup);
+	if (!ndev)
+		return -ENOMEM;
+
+	mctp_pcc_ndev = netdev_priv(ndev);
+	spin_lock_init(&mctp_pcc_ndev->lock);
+
+	/* inbox initialization */
+	rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->inbox,
+					 context.inbox_index);
+	if (rc)
+		goto free_netdev;
+
+	mctp_pcc_ndev->inbox.client.rx_callback = mctp_pcc_client_rx_callback;
+
+	/* outbox initialization */
+	rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->outbox,
+					 context.outbox_index);
+	if (rc)
+		goto free_netdev;
+
+	mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
+	mctp_pcc_ndev->acpi_device = acpi_dev;
+	mctp_pcc_ndev->ndev = ndev;
+	acpi_dev->driver_data = mctp_pcc_ndev;
+
+	/* ndev needs to be freed before the iomemory (mapped above) gets
+	 * unmapped,  devm resources get freed in reverse to the order they
+	 * are added.
+	 */
+	rc = mctp_register_netdev(ndev, NULL, MCTP_PHYS_BINDING_PCC);
+	if (rc)
+		goto free_netdev;
+
+	return devm_add_action_or_reset(dev, mctp_cleanup_netdev, ndev);
+free_netdev:
+	free_netdev(ndev);
+	return rc;
+}
+
+static const struct acpi_device_id mctp_pcc_device_ids[] = {
+	{ "DMT0001" },
+	{}
+};
+
+static struct acpi_driver mctp_pcc_driver = {
+	.name = "mctp_pcc",
+	.class = "Unknown",
+	.ids = mctp_pcc_device_ids,
+	.ops = {
+		.add = mctp_pcc_driver_add,
+	},
+};
+
+module_acpi_driver(mctp_pcc_driver);
+
+MODULE_DEVICE_TABLE(acpi, mctp_pcc_device_ids);
+
+MODULE_DESCRIPTION("MCTP PCC ACPI device");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Adam Young <admiyo@os.amperecomputing.com>");
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-19 20:51 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
@ 2025-08-22  8:21   ` Jeremy Kerr
  2025-08-26 20:51     ` Adam Young
  2025-08-26 22:37     ` Adam Young
  0 siblings, 2 replies; 14+ messages in thread
From: Jeremy Kerr @ 2025-08-22  8:21 UTC (permalink / raw)
  To: admiyo, Matt Johnston, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

Hi Adam,

A few comments inline, mainly on the locking & list traversal.

> +struct mctp_pcc_mailbox {
> +       u32 index;
> +       struct pcc_mbox_chan *chan;
> +       struct mbox_client client;
> +       struct sk_buff_head packets;
> +};
> +
> +/* The netdev structure. One of these per PCC adapter. */
> +struct mctp_pcc_ndev {
> +       /* spinlock to serialize access to queue that holds a copy of the
> +        * sk_buffs that are also in the ring buffers of the mailbox.
> +        */
> +       spinlock_t lock;
> +       struct net_device *ndev;
> +       struct acpi_device *acpi_device;
> +       struct mctp_pcc_mailbox inbox;
> +       struct mctp_pcc_mailbox outbox;
> +};
> +
> +static void *mctp_pcc_rx_alloc(struct mbox_client *c, int size)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev;
> +       struct mctp_pcc_mailbox *box;
> +       struct sk_buff *skb;
> +
> +       mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
> +       box = &mctp_pcc_ndev->inbox;
> +
> +       if (size > mctp_pcc_ndev->ndev->mtu)
> +               return NULL;
> +       skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
> +       if (!skb)
> +               return NULL;
> +       skb_put(skb, size);
> +       skb->protocol = htons(ETH_P_MCTP);
> +
> +       spin_lock(&mctp_pcc_ndev->lock);
> +       skb_queue_head(&box->packets, skb);
> +       spin_unlock(&mctp_pcc_ndev->lock);

Given you're using your own locking here (which seems sensible, since
you're iterating elsewhere), you're not relying on the list's lock. In
which case you can use __skb_queue_head() (and __skb_unlink below) as
lockless variants.

> +
> +       return skb->data;
> +}
> +
> +static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *buffer)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev;
> +       struct pcc_header pcc_header;
> +       struct sk_buff *skb = NULL;
> +       struct mctp_skb_cb *cb;
> +
> +       mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
> +       if (!buffer) {
> +               dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> +               return;
> +       }
> +
> +       spin_lock(&mctp_pcc_ndev->lock);
> +       skb_queue_walk(&mctp_pcc_ndev->inbox.packets, skb) {
> +               if (skb->data != buffer)
> +                       continue;
> +               skb_unlink(skb, &mctp_pcc_ndev->inbox.packets);
> +               break;
> +       }
> +       spin_unlock(&mctp_pcc_ndev->lock);
> +
> +       if (skb) {

The termination case of skb_queue_walk() will not leave skb as null.

Instead, you will probably want to use a temporary variable for the
iterator, and set skb when you find a match.

> +               dev_dstats_rx_add(mctp_pcc_ndev->ndev, skb->len);
> +               skb_reset_mac_header(skb);
> +               skb_pull(skb, sizeof(pcc_header));
> +               skb_reset_network_header(skb);
> +               cb = __mctp_cb(skb);
> +               cb->halen = 0;
> +               netif_rx(skb);
> +       }
> +}
> +
> +static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev;
> +       struct mctp_pcc_mailbox *box;
> +       struct sk_buff *skb = NULL;
> +
> +       mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
> +       box = container_of(c, struct mctp_pcc_mailbox, client);
> +       spin_lock(&mctp_pcc_ndev->lock);
> +       skb_queue_walk(&box->packets, skb) {
> +               if (skb->data == mssg) {
> +                       skb_unlink(skb, &box->packets);
> +                       break;
> +               }
> +       }
> +       spin_unlock(&mctp_pcc_ndev->lock);
> +
> +       if (skb)

Same as above; this will not be null in the not-found case, but will
point to the list head.

> +               dev_consume_skb_any(skb);

... but even if so, you could pass skb as NULL here and avoid the if.

> +}
> +
> +static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
> +{
> +       struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
> +       struct pcc_header *pcc_header;
> +       int len = skb->len;
> +       int rc;
> +
> +       rc = skb_cow_head(skb, sizeof(*pcc_header));
> +       if (rc) {
> +               dev_dstats_tx_dropped(ndev);
> +               kfree_skb(skb);
> +               return NETDEV_TX_OK;
> +       }
> +
> +       pcc_header = skb_push(skb, sizeof(*pcc_header));
> +       pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
> +       pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
> +       memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
> +       pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
> +
> +       spin_lock(&mpnd->lock);
> +       skb_queue_head(&mpnd->outbox.packets, skb);
> +       spin_unlock(&mpnd->lock);
> +
> +       rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
> +
> +       if (rc < 0) {
> +               skb_unlink(skb, &mpnd->outbox.packets);

This does not hold the lock you are using for the list traversal. If you
want to rely on the list-internal locking, that's fine, but you need to
be consistent in your approach.

> +               return NETDEV_TX_BUSY;
> +       }
> +
> +       dev_dstats_tx_add(ndev, len);
> +       return NETDEV_TX_OK;
> +}
> +
> +static void drain_packets(struct sk_buff_head *list)
> +{
> +       struct sk_buff *skb;
> +
> +       while (!skb_queue_empty(list)) {
> +               skb = skb_dequeue(list);
> +               dev_consume_skb_any(skb);
> +       }
> +}
> +
> +static int mctp_pcc_ndo_open(struct net_device *ndev)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev =
> +           netdev_priv(ndev);
> +       struct mctp_pcc_mailbox *outbox =
> +           &mctp_pcc_ndev->outbox;
> +       struct mctp_pcc_mailbox *inbox =
> +           &mctp_pcc_ndev->inbox;

Minor: I don't think these need wrapping?

> +       int mctp_pcc_mtu;
> +
> +       outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
> +       if (IS_ERR(outbox->chan))
> +               return PTR_ERR(outbox->chan);
> +
> +       inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
> +       if (IS_ERR(inbox->chan)) {
> +               pcc_mbox_free_channel(outbox->chan);
> +               return PTR_ERR(inbox->chan);
> +       }
> +
> +       mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
> +       mctp_pcc_ndev->outbox.chan->manage_writes = true;
> +
> +       /* There is no clean way to pass the MTU to the callback function
> +        * used for registration, so set the values ahead of time.
> +        */

For my own clarity, what's "the callback function used for
registration"?

> +       mctp_pcc_mtu = mctp_pcc_ndev->outbox.chan->shmem_size -
> +               sizeof(struct pcc_header);
> +       ndev->mtu = MCTP_MIN_MTU;
> +       ndev->max_mtu = mctp_pcc_mtu;
> +       ndev->min_mtu = MCTP_MIN_MTU;
> +
> +       return 0;
> +}
> +
> +static int mctp_pcc_ndo_stop(struct net_device *ndev)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev =
> +           netdev_priv(ndev);
> +       struct mctp_pcc_mailbox *outbox =
> +           &mctp_pcc_ndev->outbox;
> +       struct mctp_pcc_mailbox *inbox =
> +           &mctp_pcc_ndev->inbox;

No wrapping needed here either.

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-22  8:21   ` Jeremy Kerr
@ 2025-08-26 20:51     ` Adam Young
  2025-08-27  1:37       ` Jeremy Kerr
  2025-08-26 22:37     ` Adam Young
  1 sibling, 1 reply; 14+ messages in thread
From: Adam Young @ 2025-08-26 20:51 UTC (permalink / raw)
  To: Jeremy Kerr, admiyo, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

In addition to the below comment, I am removing the additional lock on 
the skb lists and using the internal one for all operations.  It leads 
to leaner and cleaner code.

Updated driver shortly.


On 8/22/25 04:21, Jeremy Kerr wrote:
>> +       mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
>> +       mctp_pcc_ndev->outbox.chan->manage_writes = true;
>> +
>> +       /* There is no clean way to pass the MTU to the callback function
>> +        * used for registration, so set the values ahead of time.
>> +        */
> For my own clarity, what's "the callback function used for
> registration"?


Actually, this is not longer true: we can do it in ndo_open, and it is 
clean.  Removed the comment.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-22  8:21   ` Jeremy Kerr
  2025-08-26 20:51     ` Adam Young
@ 2025-08-26 22:37     ` Adam Young
  1 sibling, 0 replies; 14+ messages in thread
From: Adam Young @ 2025-08-26 22:37 UTC (permalink / raw)
  To: Jeremy Kerr, admiyo, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li


On 8/22/25 04:21, Jeremy Kerr wrote:
>> +static int mctp_pcc_ndo_open(struct net_device *ndev)
>> +{
>> +       struct mctp_pcc_ndev *mctp_pcc_ndev =
>> +           netdev_priv(ndev);
>> +       struct mctp_pcc_mailbox *outbox =
>> +           &mctp_pcc_ndev->outbox;
>> +       struct mctp_pcc_mailbox *inbox =
>> +           &mctp_pcc_ndev->inbox;
> Minor: I don't think these need wrapping?

The outbox and inbox lines are longer than the mctp_pcc_ndev line, and 
they depend on it.  This ordering and wrapping passes the xmas  tree 
check and keeps assignment with declaration.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-26 20:51     ` Adam Young
@ 2025-08-27  1:37       ` Jeremy Kerr
  2025-08-27  4:55         ` Adam Young
  0 siblings, 1 reply; 14+ messages in thread
From: Jeremy Kerr @ 2025-08-27  1:37 UTC (permalink / raw)
  To: Adam Young, admiyo, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

Hi Adam,

> In addition to the below comment, I am removing the additional lock on
> the skb lists and using the internal one for all operations.  It leads
> to leaner and cleaner code.

Ok, neat!

Just be careful with locking as you're iterating the queues. Your
current approach of doing the drain under one lock acquire is probably
the best, if you can do the same with the queue-internal locking.

> > > +       mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
> > > +       mctp_pcc_ndev->outbox.chan->manage_writes = true;
> > > +
> > > +       /* There is no clean way to pass the MTU to the callback function
> > > +        * used for registration, so set the values ahead of time.
> > > +        */
> > For my own clarity, what's "the callback function used for
> > registration"?
> 
> 
> Actually, this is not longer true: we can do it in ndo_open, and it
> is clean.  Removed the comment.

OK, The current patch *does* do it in ndo_open though, hence my
confusion.

From your other reply:

> > > +static int mctp_pcc_ndo_open(struct net_device *ndev)
> > > +{
> > > +       struct mctp_pcc_ndev *mctp_pcc_ndev =
> > > +           netdev_priv(ndev);
> > > +       struct mctp_pcc_mailbox *outbox =
> > > +           &mctp_pcc_ndev->outbox;
> > > +       struct mctp_pcc_mailbox *inbox =
> > > +           &mctp_pcc_ndev->inbox;
> > Minor: I don't think these need wrapping?
> 
> The outbox and inbox lines are longer than the mctp_pcc_ndev line, and
> they depend on it.  This ordering and wrapping passes the xmas  tree
> check and keeps assignment with declaration.

If you need a specific ordering for actual correctness, no need to
force that into a reverse-christmas-tree. Spacing requirements like that
are secondary.

The only remaining query I had was the TX flow control. You're returning
NETDEV_TX_BUSY while the queues are still running, so are likely to get
repeated TX in a loop there.

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net-next v25 0/1] MCTP Over PCC Transport
@ 2025-08-27  4:48 admiyo
  2025-08-27  4:48 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
  2025-08-27  4:57 ` [PATCH net-next v25 0/1] MCTP Over " Adam Young
  0 siblings, 2 replies; 14+ messages in thread
From: admiyo @ 2025-08-27  4:48 UTC (permalink / raw)
  Cc: netdev, linux-kernel, Jeremy Kerr, Matt Johnston,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Sudeep Holla, Jonathan Cameron, Huisong Li

From: Adam Young <admiyo@os.amperecomputing.com>

This series adds support for the Management Control Transport Protocol (MCTP)
over the Platform Communication Channel (PCC) mechanism.

DMTF DSP:0292
https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf

MCTP defines a communication model intended to
facilitate communication between Management controllers
and other management controllers, and between Management
controllers and management devices

PCC is a mechanism for communication between components within
the  Platform.  It is a composed of shared memory regions,
interrupt registers, and status registers.

The MCTP over PCC driver makes use of two PCC channels. For
sending messages, it uses a Type 3 channel, and for receiving
messages it uses the paired Type 4 channel.  The device
and corresponding channels are specified via ACPI.

The first patch in the series implements a mechanism to allow the driver
to indicate whether an ACK should be sent back to the caller
after processing the interrupt.  This is an optional feature in
the PCC code, but has been made explicitly required in another driver.
The implementation here maintains the backwards compatibility of that
driver.

MCTP is a general purpose  protocol so  it would  be impossible to enumerate
all the use cases, but some of the ones that are most topical are attestation
and RAS support.  There are a handful of protocols built on top of MCTP, to
include PLDM and SPDM, both specified by the DMTF.

https://www.dmtf.org/sites/default/files/standards/documents/DSP0240_1.0.0.pdf
https://www.dmtf.org/sites/default/files/standards/documents/DSP0274_1.3.0.pd

SPDM entails various usages, including device identity collection, device
authentication, measurement collection, and device secure session establishment.

PLDM is more likely to be used  for hardware support: temperature, voltage, or
fan sensor control.

At least two companies have devices that can make use of the mechanism. One is
Ampere Computing, my employer.

The mechanism it uses is called Platform Communication Channels is part of the
ACPI spec: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html

Since it is a socket interface, the system administrator also has  the ability
to ignore an MCTP link that they do not want to enable.  This link would be visible
to the end user, but would not be usable.

If MCTP support is disabled in the Kernel, this driver would also be disabled.

PCC is based on a shared buffer and a set of I/O mapped memory locations that the
Spec calls registers.  This mechanism exists regardless of the existence of the
driver. Thus, if the user has the ability to map these  physical location to
virtual locations, they have the ability to drive the hardware.  Thus, there
is a security aspect to this mechanism that extends beyond the responsibilities
of the operating system.

If the hardware does not expose the PCC in the ACPI table, this device will never
be enabled.  Thus it is only an issue on hard that does support PCC.  In that case,
it is up to the remote controller to sanitize communication; MCTP will be exposed
as a socket interface, and userland can send any crafted packet it wants.  It would
thus also be incumbent on the hardware manufacturer to allow the end user to disable
MCTP over PCC communication if they did not want to expose it.

Previous implementations of the pcc version of the mailbox protocol assumed the
driver was directly managing the shared memory region.  This lead to duplicated
code and missed steps of the PCC protocol. The first patch in this series makes
it possible for mailbox/pcc to manage the writing of the buffer prior to sending
messages.  It also fixes the notification of message transmission completion.

Previous Version:
https://lore.kernel.org/lkml/20250819205159.347561-1-admiyo@os.amperecomputing.com/

Changes in V26:
-  Remove the addition net-device spinlock and use the spinlock already present in skb lists
-  Use temporary variables to check for success finding the skb in the lists
-  Remove comment that is no longer relevant

Changes in V25:
- Use spin lock to control access to queues of sk_buffs
- removed unused constants
- added ndo_open and ndo_stop functions.  These two functions do
  channel creation and cleanup, to remove packets from the mailbox.
  They do queue cleanup as well.
- No longer cleans up the channel from the device.

Changes in V24:
- Removed endianess for PCC header values
- Kept Column width to under 80 chars
- Typo in commit message
- Prereqisite patch for PCC buffer management was merged late in 6.17.
  See "mailbox/pcc: support mailbox management of the shared buffer"

Changes in V23:
- Trigger for direct management of shared buffer based on flag in pcc channel
- Only initialize rx_alloc for inbox, not outbox.
- Read value for requested IRQ flag out of channel's current_req
- unqueue an sk_buff that failed to send
- Move error handling for skb resize error inline instead of goto

Changes in V22:
- Direct management of the shared buffer in the mailbox layer.
- Proper checking of command complete flag prior to writing to the buffer.

Changes in V21:
- Use existing constants PCC_SIGNATURE and PCC_CMD_COMPLETION_NOTIFY
- Check return code on call to send_data and drop packet if failed
- use sizeof(*mctp_pcc_header) etc,  instead of structs for resizing buffers
- simplify check for ares->type != PCC_DWORD_TYPE
- simply return result devm_add_action_or_reset
- reduce initializer for  mctp_pcc_lookup_context context = {};
- move initialization of mbox dev into mctp_pcc_initialize_mailbox
- minor spacing changes

Changes in V20:
- corrected typo in RFC version
- removed spurious space
- tx spin lock only controls access to shared memory buffer
- tx spin lock not eheld on error condition
- tx returns OK if skb can't be expanded

Changes in V19:
- Rebased on changes to PCC mailbox handling
- checks for cloned SKB prior to transmission
- converted doulbe slash comments to C comments

Changes in V18:
- Added Acked-By
- Fix minor spacing issue

Changes in V17:
- No new changes. Rebased on net-next post 6.13 release.

Changes in V16:
- do not duplicate cleanup after devm_add_action_or_reset calls

Changes in V15:
- corrected indentation formatting error
- Corrected TABS issue in MAINTAINER entry

Changes in V14:
- Do not attempt to unregister a netdev that is never registered
- Added MAINTAINER entry

Changes in V13:
- Explicitly Convert PCC header from little endian to machine native

Changes in V12:
- Explicitly use little endian conversion for PCC header signature
- Builds clean with make C=1

Changes in V11:
- Explicitly use little endian types for PCC header

Changes in V11:
- Switch Big Endian data types to machine local for PCC header
- use mctp specific function for registering netdev

Changes in V10:
- sync with net-next branch
- use dstats helper functions
- remove duplicate drop stat
- remove more double spaces

Changes in V9:
- Prerequisite patch for PCC mailbox has been merged
- Stats collection now use helper functions
- many double spaces reduced to single

Changes in V8:
- change 0 to NULL for pointer check of shmem
- add semi for static version of pcc_mbox_ioremap
- convert pcc_mbox_ioremap function to static inline when client code is not being built
- remove shmem comment from struct pcc_chan_info descriptor
- copy rx_dropped in mctp_pcc_net_stats
- removed trailing newline on error message
- removed double space in dev_dbg string
- use big endian for header members
- Fix use full spec ID in description
- Fix typo in file description
- Form the complete outbound message in the sk_buff

Changes in V7:
- Removed the Hardware address as specification is not published.
- Map the shared buffer in the mailbox and share the mapped region with the driver
- Use the sk_buff memory to prepare the message before copying to shared region

Changes in V6:
- Removed patch for ACPICA code that has merged
- Includes the hardware address in the network device
- Converted all device resources to devm resources
- Removed mctp_pcc_driver_remove function
- uses acpi_driver_module for initialization
- created helper structure for in and out mailboxes
- Consolidated code for initializing mailboxes in the add_device function
- Added specification references
- Removed duplicate constant PCC_ACK_FLAG_MASK
- Use the MCTP_SIGNATURE_LENGTH define
- made naming of header structs consistent
- use sizeof local variables for offset calculations
- prefix structure name to avoid potential clash
- removed unnecessary null initialization from acpi_device_id

Changes in V5
- Removed Owner field from ACPI module declaration
- removed unused next field from struct mctp_pcc_ndev
- Corrected logic reading  RX ACK flag.
- Added comment for struct pcc_chan_info field shmem_base_addr
- check against current mtu instead of max mtu for packet length\
- removed unnecessary lookups of pnd->mdev.dev

Changes in V4
- Read flags out of shared buffer to trigger ACK for Type 4 RX
- Remove list of netdevs and cleanup from devices only
- tag PCCT protocol headers as little endian
- Remove unused constants

Changes in V3
- removed unused header
- removed spurious space
- removed spurious semis after functiomns
- removed null assignment for init
- remove redundant set of device on skb
- tabify constant declarations
- added  rtnl_link_stats64 function
- set MTU to minimum to start
- clean up logic on driver removal
- remove cast on void * assignment
- call cleanup function directly
- check received length before allocating skb
- introduce symbolic constatn for ACK FLAG MASK
- symbolic constant for PCC header flag.
- Add namespace ID to PCC magic
- replaced readls with copy from io of PCC header
- replaced custom modules init and cleanup with ACPI version

Changes in V2

- All Variable Declarations are in reverse Xmass Tree Format
- All Checkpatch Warnings Are Fixed
- Removed Dead code
- Added packet tx/rx stats
- Removed network physical address.  This is still in
  disucssion in the spec, and will be added once there
  is consensus. The protocol can be used with out it.
  This also lead to the removal of the Big Endian
  conversions.
- Avoided using non volatile pointers in copy to and from io space
- Reorderd the patches to put the ACK check for the PCC Mailbox
  as a pre-requisite.  The corresponding change for the MCTP
  driver has been inlined in the main patch.
- Replaced magic numbers with constants, fixed typos, and other
  minor changes from code review.

Adam Young (1):
  mctp pcc: Implement MCTP over PCC Transport

 MAINTAINERS                 |   5 +
 drivers/net/mctp/Kconfig    |  13 ++
 drivers/net/mctp/Makefile   |   1 +
 drivers/net/mctp/mctp-pcc.c | 367 ++++++++++++++++++++++++++++++++++++
 4 files changed, 386 insertions(+)
 create mode 100644 drivers/net/mctp/mctp-pcc.c

-- 
2.43.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-27  4:48 [PATCH net-next v25 0/1] MCTP Over PCC Transport admiyo
@ 2025-08-27  4:48 ` admiyo
  2025-08-27  7:36   ` Jeremy Kerr
                     ` (3 more replies)
  2025-08-27  4:57 ` [PATCH net-next v25 0/1] MCTP Over " Adam Young
  1 sibling, 4 replies; 14+ messages in thread
From: admiyo @ 2025-08-27  4:48 UTC (permalink / raw)
  To: Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Adam Young
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

From: Adam Young <admiyo@os.amperecomputing.com>

Implementation of network driver for
Management Control Transport Protocol(MCTP)
over Platform Communication Channel(PCC)

DMTF DSP:0292
https://www.dmtf.org/sites/default/files/standards/documents/\
DSP0292_1.0.0WIP50.pdf

MCTP devices are specified via ACPI by entries
in DSDT/SDST and reference channels specified
in the PCCT.  Messages are sent on a type 3 and
received on a type 4 channel.  Communication with
other devices use the PCC based doorbell mechanism;
a shared memory segment with a corresponding
interrupt and a memory register used to trigger
remote interrupts.

This driver takes advantage of PCC mailbox buffer
management. The data section of the struct sk_buff
that contains the outgoing packet is sent to the mailbox,
already properly formatted  as a PCC message.  The driver
is also responsible for allocating a struct sk_buff that
is then passed to the mailbox and used to record the
data in the shared buffer. It maintains a list of both
outging and incoming sk_buffs to match the data buffers

When the Type 3 channel outbox receives a txdone response
interrupt, it consumes the outgoing sk_buff, allowing
it to be freed.

Bringing the interface up and down creates and frees
the channel between the network driver and the mailbox
driver. Freeing the channel also frees any packets that
are cached in the mailbox ringbuffer.

Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
---
 MAINTAINERS                 |   5 +
 drivers/net/mctp/Kconfig    |  13 ++
 drivers/net/mctp/Makefile   |   1 +
 drivers/net/mctp/mctp-pcc.c | 367 ++++++++++++++++++++++++++++++++++++
 4 files changed, 386 insertions(+)
 create mode 100644 drivers/net/mctp/mctp-pcc.c

diff --git a/MAINTAINERS b/MAINTAINERS
index bce96dd254b8..de359bddcb2f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14660,6 +14660,11 @@ F:	include/net/mctpdevice.h
 F:	include/net/netns/mctp.h
 F:	net/mctp/
 
+MANAGEMENT COMPONENT TRANSPORT PROTOCOL (MCTP) over PCC (MCTP-PCC) Driver
+M:	Adam Young <admiyo@os.amperecomputing.com>
+S:	Maintained
+F:	drivers/net/mctp/mctp-pcc.c
+
 MAPLE TREE
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
 L:	maple-tree@lists.infradead.org
diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
index cf325ab0b1ef..f69d0237f058 100644
--- a/drivers/net/mctp/Kconfig
+++ b/drivers/net/mctp/Kconfig
@@ -57,6 +57,19 @@ config MCTP_TRANSPORT_USB
 	  MCTP-over-USB interfaces are peer-to-peer, so each interface
 	  represents a physical connection to one remote MCTP endpoint.
 
+config MCTP_TRANSPORT_PCC
+	tristate "MCTP PCC transport"
+	depends on ACPI
+	help
+	  Provides a driver to access MCTP devices over PCC transport,
+	  A MCTP protocol network device is created via ACPI for each
+	  entry in the DST/SDST that matches the identifier. The Platform
+	  communication channels are selected from the corresponding
+	  entries in the PCCT.
+
+	  Say y here if you need to connect to MCTP endpoints over PCC. To
+	  compile as a module, use m; the module will be called mctp-pcc.
+
 endmenu
 
 endif
diff --git a/drivers/net/mctp/Makefile b/drivers/net/mctp/Makefile
index c36006849a1e..2276f148df7c 100644
--- a/drivers/net/mctp/Makefile
+++ b/drivers/net/mctp/Makefile
@@ -1,3 +1,4 @@
+obj-$(CONFIG_MCTP_TRANSPORT_PCC) += mctp-pcc.o
 obj-$(CONFIG_MCTP_SERIAL) += mctp-serial.o
 obj-$(CONFIG_MCTP_TRANSPORT_I2C) += mctp-i2c.o
 obj-$(CONFIG_MCTP_TRANSPORT_I3C) += mctp-i3c.o
diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
new file mode 100644
index 000000000000..c6578b27c00c
--- /dev/null
+++ b/drivers/net/mctp/mctp-pcc.c
@@ -0,0 +1,367 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * mctp-pcc.c - Driver for MCTP over PCC.
+ * Copyright (c) 2024-2025, Ampere Computing LLC
+ *
+ */
+
+/* Implementation of MCTP over PCC DMTF Specification DSP0256
+ * https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
+ */
+
+#include <linux/acpi.h>
+#include <linux/if_arp.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mailbox_client.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+#include <linux/string.h>
+#include <linux/skbuff.h>
+#include <linux/hrtimer.h>
+
+#include <acpi/acpi_bus.h>
+#include <acpi/acpi_drivers.h>
+#include <acpi/acrestyp.h>
+#include <acpi/actbl.h>
+#include <net/mctp.h>
+#include <net/mctpdevice.h>
+#include <acpi/pcc.h>
+
+#define MCTP_SIGNATURE          "MCTP"
+#define MCTP_SIGNATURE_LENGTH   (sizeof(MCTP_SIGNATURE) - 1)
+#define MCTP_MIN_MTU            68
+#define PCC_DWORD_TYPE          0x0c
+
+struct mctp_pcc_mailbox {
+	u32 index;
+	struct pcc_mbox_chan *chan;
+	struct mbox_client client;
+	struct sk_buff_head packets;
+};
+
+/* The netdev structure. One of these per PCC adapter. */
+struct mctp_pcc_ndev {
+	struct net_device *ndev;
+	struct acpi_device *acpi_device;
+	struct mctp_pcc_mailbox inbox;
+	struct mctp_pcc_mailbox outbox;
+};
+
+static void *mctp_pcc_rx_alloc(struct mbox_client *c, int size)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct mctp_pcc_mailbox *box;
+	struct sk_buff *skb;
+
+	mctp_pcc_ndev =	container_of(c, struct mctp_pcc_ndev, inbox.client);
+	box = &mctp_pcc_ndev->inbox;
+
+	if (size > mctp_pcc_ndev->ndev->mtu)
+		return NULL;
+	skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
+	if (!skb)
+		return NULL;
+	skb_put(skb, size);
+	skb->protocol = htons(ETH_P_MCTP);
+	skb_queue_head(&box->packets, skb);
+
+	return skb->data;
+}
+
+static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *buffer)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct sk_buff *curr_skb = NULL;
+	struct pcc_header pcc_header;
+	struct sk_buff *skb = NULL;
+	struct mctp_skb_cb *cb;
+
+	mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
+	if (!buffer) {
+		dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+		return;
+	}
+
+	spin_lock(&mctp_pcc_ndev->inbox.packets.lock);
+	skb_queue_walk(&mctp_pcc_ndev->inbox.packets, curr_skb) {
+		skb = curr_skb;
+		if (skb->data != buffer)
+			continue;
+		__skb_unlink(skb, &mctp_pcc_ndev->inbox.packets);
+		break;
+	}
+	spin_unlock(&mctp_pcc_ndev->inbox.packets.lock);
+
+	if (skb) {
+		dev_dstats_rx_add(mctp_pcc_ndev->ndev, skb->len);
+		skb_reset_mac_header(skb);
+		skb_pull(skb, sizeof(pcc_header));
+		skb_reset_network_header(skb);
+		cb = __mctp_cb(skb);
+		cb->halen = 0;
+		netif_rx(skb);
+	}
+}
+
+static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct mctp_pcc_mailbox *box;
+	struct sk_buff *skb = NULL;
+	struct sk_buff *curr_skb;
+
+	mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
+	box = container_of(c, struct mctp_pcc_mailbox, client);
+	spin_lock(&box->packets.lock);
+	skb_queue_walk(&box->packets, curr_skb) {
+		skb = curr_skb;
+		if (skb->data == mssg) {
+			__skb_unlink(skb, &box->packets);
+			break;
+		}
+	}
+	spin_unlock(&box->packets.lock);
+
+	if (skb)
+		dev_consume_skb_any(skb);
+}
+
+static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
+{
+	struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
+	struct pcc_header *pcc_header;
+	int len = skb->len;
+	int rc;
+
+	rc = skb_cow_head(skb, sizeof(*pcc_header));
+	if (rc) {
+		dev_dstats_tx_dropped(ndev);
+		kfree_skb(skb);
+		return NETDEV_TX_OK;
+	}
+
+	pcc_header = skb_push(skb, sizeof(*pcc_header));
+	pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
+	pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
+	memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
+	pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
+
+	skb_queue_head(&mpnd->outbox.packets, skb);
+
+	rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
+
+	if (rc < 0) {
+		skb_unlink(skb, &mpnd->outbox.packets);
+		return NETDEV_TX_BUSY;
+	}
+
+	dev_dstats_tx_add(ndev, len);
+	return NETDEV_TX_OK;
+}
+
+static void drain_packets(struct sk_buff_head *list)
+{
+	struct sk_buff *skb;
+
+	while (!skb_queue_empty(list)) {
+		skb = skb_dequeue(list);
+		dev_consume_skb_any(skb);
+	}
+}
+
+static int mctp_pcc_ndo_open(struct net_device *ndev)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev =
+	    netdev_priv(ndev);
+	struct mctp_pcc_mailbox *outbox =
+	    &mctp_pcc_ndev->outbox;
+	struct mctp_pcc_mailbox *inbox =
+	    &mctp_pcc_ndev->inbox;
+	int mctp_pcc_mtu;
+
+	outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
+	if (IS_ERR(outbox->chan))
+		return PTR_ERR(outbox->chan);
+
+	inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
+	if (IS_ERR(inbox->chan)) {
+		pcc_mbox_free_channel(outbox->chan);
+		return PTR_ERR(inbox->chan);
+	}
+
+	mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
+	mctp_pcc_ndev->outbox.chan->manage_writes = true;
+
+	mctp_pcc_mtu = mctp_pcc_ndev->outbox.chan->shmem_size -
+		sizeof(struct pcc_header);
+	ndev->mtu = MCTP_MIN_MTU;
+	ndev->max_mtu = mctp_pcc_mtu;
+	ndev->min_mtu = MCTP_MIN_MTU;
+
+	return 0;
+}
+
+static int mctp_pcc_ndo_stop(struct net_device *ndev)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev =
+	    netdev_priv(ndev);
+	struct mctp_pcc_mailbox *outbox =
+	    &mctp_pcc_ndev->outbox;
+	struct mctp_pcc_mailbox *inbox =
+	    &mctp_pcc_ndev->inbox;
+
+	pcc_mbox_free_channel(outbox->chan);
+	pcc_mbox_free_channel(inbox->chan);
+
+	drain_packets(&mctp_pcc_ndev->outbox.packets);
+	drain_packets(&mctp_pcc_ndev->inbox.packets);
+	return 0;
+}
+
+static const struct net_device_ops mctp_pcc_netdev_ops = {
+	.ndo_open = mctp_pcc_ndo_open,
+	.ndo_stop = mctp_pcc_ndo_stop,
+	.ndo_start_xmit = mctp_pcc_tx,
+
+};
+
+static void mctp_pcc_setup(struct net_device *ndev)
+{
+	ndev->type = ARPHRD_MCTP;
+	ndev->hard_header_len = 0;
+	ndev->tx_queue_len = 0;
+	ndev->flags = IFF_NOARP;
+	ndev->netdev_ops = &mctp_pcc_netdev_ops;
+	ndev->needs_free_netdev = true;
+	ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
+}
+
+struct mctp_pcc_lookup_context {
+	int index;
+	u32 inbox_index;
+	u32 outbox_index;
+};
+
+static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
+				       void *context)
+{
+	struct mctp_pcc_lookup_context *luc = context;
+	struct acpi_resource_address32 *addr;
+
+	if (ares->type != PCC_DWORD_TYPE)
+		return AE_OK;
+
+	addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
+	switch (luc->index) {
+	case 0:
+		luc->outbox_index = addr[0].address.minimum;
+		break;
+	case 1:
+		luc->inbox_index = addr[0].address.minimum;
+		break;
+	}
+	luc->index++;
+	return AE_OK;
+}
+
+static void mctp_cleanup_netdev(void *data)
+{
+	struct net_device *ndev = data;
+
+	mctp_unregister_netdev(ndev);
+}
+
+static int mctp_pcc_initialize_mailbox(struct device *dev,
+				       struct mctp_pcc_mailbox *box, u32 index)
+{
+	box->index = index;
+	skb_queue_head_init(&box->packets);
+	box->client.dev = dev;
+	return 0;
+}
+
+static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
+{
+	struct mctp_pcc_lookup_context context = {0};
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct device *dev = &acpi_dev->dev;
+	struct net_device *ndev;
+	acpi_handle dev_handle;
+	acpi_status status;
+	char name[32];
+	int rc;
+
+	dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
+		acpi_device_hid(acpi_dev));
+	dev_handle = acpi_device_handle(acpi_dev);
+	status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
+				     &context);
+	if (!ACPI_SUCCESS(status)) {
+		dev_err(dev, "FAILURE to lookup PCC indexes from CRS\n");
+		return -EINVAL;
+	}
+
+	snprintf(name, sizeof(name), "mctpipcc%d", context.inbox_index);
+	ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
+			    mctp_pcc_setup);
+	if (!ndev)
+		return -ENOMEM;
+
+	mctp_pcc_ndev = netdev_priv(ndev);
+
+	/* inbox initialization */
+	rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->inbox,
+					 context.inbox_index);
+	if (rc)
+		goto free_netdev;
+
+	mctp_pcc_ndev->inbox.client.rx_callback = mctp_pcc_client_rx_callback;
+
+	/* outbox initialization */
+	rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->outbox,
+					 context.outbox_index);
+	if (rc)
+		goto free_netdev;
+
+	mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
+	mctp_pcc_ndev->acpi_device = acpi_dev;
+	mctp_pcc_ndev->ndev = ndev;
+	acpi_dev->driver_data = mctp_pcc_ndev;
+
+	/* ndev needs to be freed before the iomemory (mapped above) gets
+	 * unmapped,  devm resources get freed in reverse to the order they
+	 * are added.
+	 */
+	rc = mctp_register_netdev(ndev, NULL, MCTP_PHYS_BINDING_PCC);
+	if (rc)
+		goto free_netdev;
+
+	return devm_add_action_or_reset(dev, mctp_cleanup_netdev, ndev);
+free_netdev:
+	free_netdev(ndev);
+	return rc;
+}
+
+static const struct acpi_device_id mctp_pcc_device_ids[] = {
+	{ "DMT0001" },
+	{}
+};
+
+static struct acpi_driver mctp_pcc_driver = {
+	.name = "mctp_pcc",
+	.class = "Unknown",
+	.ids = mctp_pcc_device_ids,
+	.ops = {
+		.add = mctp_pcc_driver_add,
+	},
+};
+
+module_acpi_driver(mctp_pcc_driver);
+
+MODULE_DEVICE_TABLE(acpi, mctp_pcc_device_ids);
+
+MODULE_DESCRIPTION("MCTP PCC ACPI device");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Adam Young <admiyo@os.amperecomputing.com>");
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-27  1:37       ` Jeremy Kerr
@ 2025-08-27  4:55         ` Adam Young
  0 siblings, 0 replies; 14+ messages in thread
From: Adam Young @ 2025-08-27  4:55 UTC (permalink / raw)
  To: Jeremy Kerr, admiyo, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li


On 8/26/25 21:37, Jeremy Kerr wrote:
> The only remaining query I had was the TX flow control. You're returning
> NETDEV_TX_BUSY while the queues are still running, so are likely to get
> repeated TX in a loop there.

Sorry, I missed this until just after hitting submit again.

The code currently goes

         if (rc < 0) {
                 skb_unlink(skb, &mpnd->outbox.packets);
                 return NETDEV_TX_BUSY;
         }
Which means the failed-to-send packet is unlinked.  I guess I am unclear 
if this is sufficient to deal with the packet flow control issue or not.

I have not yet had a setup up where I can flood the network with packets 
and see what happens if I fill up the ring buffer.  I think that is the 
most likely failure case that will lead to flow control issues.  If the 
remote side cannot handle packets as fast as they are sent, at some 
point we have to stop sending them.  The mailbox abstraction makes that 
hard to detect;  I think the send_message will hang trying to get the 
lock on the shared buffer, and time out.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 0/1] MCTP Over PCC Transport
  2025-08-27  4:48 [PATCH net-next v25 0/1] MCTP Over PCC Transport admiyo
  2025-08-27  4:48 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
@ 2025-08-27  4:57 ` Adam Young
  1 sibling, 0 replies; 14+ messages in thread
From: Adam Young @ 2025-08-27  4:57 UTC (permalink / raw)
  To: admiyo
  Cc: netdev, linux-kernel, Jeremy Kerr, Matt Johnston,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Sudeep Holla, Jonathan Cameron, Huisong Li

Apologies for misnumbering this. It is V26 not V25.  I can resend if 
that is appropriate.


On 8/27/25 00:48, admiyo@os.amperecomputing.com wrote:
> From: Adam Young <admiyo@os.amperecomputing.com>
>
> This series adds support for the Management Control Transport Protocol (MCTP)
> over the Platform Communication Channel (PCC) mechanism.
>
> DMTF DSP:0292
> https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
>
> MCTP defines a communication model intended to
> facilitate communication between Management controllers
> and other management controllers, and between Management
> controllers and management devices
>
> PCC is a mechanism for communication between components within
> the  Platform.  It is a composed of shared memory regions,
> interrupt registers, and status registers.
>
> The MCTP over PCC driver makes use of two PCC channels. For
> sending messages, it uses a Type 3 channel, and for receiving
> messages it uses the paired Type 4 channel.  The device
> and corresponding channels are specified via ACPI.
>
> The first patch in the series implements a mechanism to allow the driver
> to indicate whether an ACK should be sent back to the caller
> after processing the interrupt.  This is an optional feature in
> the PCC code, but has been made explicitly required in another driver.
> The implementation here maintains the backwards compatibility of that
> driver.
>
> MCTP is a general purpose  protocol so  it would  be impossible to enumerate
> all the use cases, but some of the ones that are most topical are attestation
> and RAS support.  There are a handful of protocols built on top of MCTP, to
> include PLDM and SPDM, both specified by the DMTF.
>
> https://www.dmtf.org/sites/default/files/standards/documents/DSP0240_1.0.0.pdf
> https://www.dmtf.org/sites/default/files/standards/documents/DSP0274_1.3.0.pd
>
> SPDM entails various usages, including device identity collection, device
> authentication, measurement collection, and device secure session establishment.
>
> PLDM is more likely to be used  for hardware support: temperature, voltage, or
> fan sensor control.
>
> At least two companies have devices that can make use of the mechanism. One is
> Ampere Computing, my employer.
>
> The mechanism it uses is called Platform Communication Channels is part of the
> ACPI spec: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html
>
> Since it is a socket interface, the system administrator also has  the ability
> to ignore an MCTP link that they do not want to enable.  This link would be visible
> to the end user, but would not be usable.
>
> If MCTP support is disabled in the Kernel, this driver would also be disabled.
>
> PCC is based on a shared buffer and a set of I/O mapped memory locations that the
> Spec calls registers.  This mechanism exists regardless of the existence of the
> driver. Thus, if the user has the ability to map these  physical location to
> virtual locations, they have the ability to drive the hardware.  Thus, there
> is a security aspect to this mechanism that extends beyond the responsibilities
> of the operating system.
>
> If the hardware does not expose the PCC in the ACPI table, this device will never
> be enabled.  Thus it is only an issue on hard that does support PCC.  In that case,
> it is up to the remote controller to sanitize communication; MCTP will be exposed
> as a socket interface, and userland can send any crafted packet it wants.  It would
> thus also be incumbent on the hardware manufacturer to allow the end user to disable
> MCTP over PCC communication if they did not want to expose it.
>
> Previous implementations of the pcc version of the mailbox protocol assumed the
> driver was directly managing the shared memory region.  This lead to duplicated
> code and missed steps of the PCC protocol. The first patch in this series makes
> it possible for mailbox/pcc to manage the writing of the buffer prior to sending
> messages.  It also fixes the notification of message transmission completion.
>
> Previous Version:
> https://lore.kernel.org/lkml/20250819205159.347561-1-admiyo@os.amperecomputing.com/
>
> Changes in V26:
> -  Remove the addition net-device spinlock and use the spinlock already present in skb lists
> -  Use temporary variables to check for success finding the skb in the lists
> -  Remove comment that is no longer relevant
>
> Changes in V25:
> - Use spin lock to control access to queues of sk_buffs
> - removed unused constants
> - added ndo_open and ndo_stop functions.  These two functions do
>    channel creation and cleanup, to remove packets from the mailbox.
>    They do queue cleanup as well.
> - No longer cleans up the channel from the device.
>
> Changes in V24:
> - Removed endianess for PCC header values
> - Kept Column width to under 80 chars
> - Typo in commit message
> - Prereqisite patch for PCC buffer management was merged late in 6.17.
>    See "mailbox/pcc: support mailbox management of the shared buffer"
>
> Changes in V23:
> - Trigger for direct management of shared buffer based on flag in pcc channel
> - Only initialize rx_alloc for inbox, not outbox.
> - Read value for requested IRQ flag out of channel's current_req
> - unqueue an sk_buff that failed to send
> - Move error handling for skb resize error inline instead of goto
>
> Changes in V22:
> - Direct management of the shared buffer in the mailbox layer.
> - Proper checking of command complete flag prior to writing to the buffer.
>
> Changes in V21:
> - Use existing constants PCC_SIGNATURE and PCC_CMD_COMPLETION_NOTIFY
> - Check return code on call to send_data and drop packet if failed
> - use sizeof(*mctp_pcc_header) etc,  instead of structs for resizing buffers
> - simplify check for ares->type != PCC_DWORD_TYPE
> - simply return result devm_add_action_or_reset
> - reduce initializer for  mctp_pcc_lookup_context context = {};
> - move initialization of mbox dev into mctp_pcc_initialize_mailbox
> - minor spacing changes
>
> Changes in V20:
> - corrected typo in RFC version
> - removed spurious space
> - tx spin lock only controls access to shared memory buffer
> - tx spin lock not eheld on error condition
> - tx returns OK if skb can't be expanded
>
> Changes in V19:
> - Rebased on changes to PCC mailbox handling
> - checks for cloned SKB prior to transmission
> - converted doulbe slash comments to C comments
>
> Changes in V18:
> - Added Acked-By
> - Fix minor spacing issue
>
> Changes in V17:
> - No new changes. Rebased on net-next post 6.13 release.
>
> Changes in V16:
> - do not duplicate cleanup after devm_add_action_or_reset calls
>
> Changes in V15:
> - corrected indentation formatting error
> - Corrected TABS issue in MAINTAINER entry
>
> Changes in V14:
> - Do not attempt to unregister a netdev that is never registered
> - Added MAINTAINER entry
>
> Changes in V13:
> - Explicitly Convert PCC header from little endian to machine native
>
> Changes in V12:
> - Explicitly use little endian conversion for PCC header signature
> - Builds clean with make C=1
>
> Changes in V11:
> - Explicitly use little endian types for PCC header
>
> Changes in V11:
> - Switch Big Endian data types to machine local for PCC header
> - use mctp specific function for registering netdev
>
> Changes in V10:
> - sync with net-next branch
> - use dstats helper functions
> - remove duplicate drop stat
> - remove more double spaces
>
> Changes in V9:
> - Prerequisite patch for PCC mailbox has been merged
> - Stats collection now use helper functions
> - many double spaces reduced to single
>
> Changes in V8:
> - change 0 to NULL for pointer check of shmem
> - add semi for static version of pcc_mbox_ioremap
> - convert pcc_mbox_ioremap function to static inline when client code is not being built
> - remove shmem comment from struct pcc_chan_info descriptor
> - copy rx_dropped in mctp_pcc_net_stats
> - removed trailing newline on error message
> - removed double space in dev_dbg string
> - use big endian for header members
> - Fix use full spec ID in description
> - Fix typo in file description
> - Form the complete outbound message in the sk_buff
>
> Changes in V7:
> - Removed the Hardware address as specification is not published.
> - Map the shared buffer in the mailbox and share the mapped region with the driver
> - Use the sk_buff memory to prepare the message before copying to shared region
>
> Changes in V6:
> - Removed patch for ACPICA code that has merged
> - Includes the hardware address in the network device
> - Converted all device resources to devm resources
> - Removed mctp_pcc_driver_remove function
> - uses acpi_driver_module for initialization
> - created helper structure for in and out mailboxes
> - Consolidated code for initializing mailboxes in the add_device function
> - Added specification references
> - Removed duplicate constant PCC_ACK_FLAG_MASK
> - Use the MCTP_SIGNATURE_LENGTH define
> - made naming of header structs consistent
> - use sizeof local variables for offset calculations
> - prefix structure name to avoid potential clash
> - removed unnecessary null initialization from acpi_device_id
>
> Changes in V5
> - Removed Owner field from ACPI module declaration
> - removed unused next field from struct mctp_pcc_ndev
> - Corrected logic reading  RX ACK flag.
> - Added comment for struct pcc_chan_info field shmem_base_addr
> - check against current mtu instead of max mtu for packet length\
> - removed unnecessary lookups of pnd->mdev.dev
>
> Changes in V4
> - Read flags out of shared buffer to trigger ACK for Type 4 RX
> - Remove list of netdevs and cleanup from devices only
> - tag PCCT protocol headers as little endian
> - Remove unused constants
>
> Changes in V3
> - removed unused header
> - removed spurious space
> - removed spurious semis after functiomns
> - removed null assignment for init
> - remove redundant set of device on skb
> - tabify constant declarations
> - added  rtnl_link_stats64 function
> - set MTU to minimum to start
> - clean up logic on driver removal
> - remove cast on void * assignment
> - call cleanup function directly
> - check received length before allocating skb
> - introduce symbolic constatn for ACK FLAG MASK
> - symbolic constant for PCC header flag.
> - Add namespace ID to PCC magic
> - replaced readls with copy from io of PCC header
> - replaced custom modules init and cleanup with ACPI version
>
> Changes in V2
>
> - All Variable Declarations are in reverse Xmass Tree Format
> - All Checkpatch Warnings Are Fixed
> - Removed Dead code
> - Added packet tx/rx stats
> - Removed network physical address.  This is still in
>    disucssion in the spec, and will be added once there
>    is consensus. The protocol can be used with out it.
>    This also lead to the removal of the Big Endian
>    conversions.
> - Avoided using non volatile pointers in copy to and from io space
> - Reorderd the patches to put the ACK check for the PCC Mailbox
>    as a pre-requisite.  The corresponding change for the MCTP
>    driver has been inlined in the main patch.
> - Replaced magic numbers with constants, fixed typos, and other
>    minor changes from code review.
>
> Adam Young (1):
>    mctp pcc: Implement MCTP over PCC Transport
>
>   MAINTAINERS                 |   5 +
>   drivers/net/mctp/Kconfig    |  13 ++
>   drivers/net/mctp/Makefile   |   1 +
>   drivers/net/mctp/mctp-pcc.c | 367 ++++++++++++++++++++++++++++++++++++
>   4 files changed, 386 insertions(+)
>   create mode 100644 drivers/net/mctp/mctp-pcc.c
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-27  4:48 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
@ 2025-08-27  7:36   ` Jeremy Kerr
  2025-08-27 17:54   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Jeremy Kerr @ 2025-08-27  7:36 UTC (permalink / raw)
  To: admiyo, Matt Johnston, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

Hi Adam,

> +static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *buffer)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev;
> +       struct sk_buff *curr_skb = NULL;
> +       struct pcc_header pcc_header;
> +       struct sk_buff *skb = NULL;
> +       struct mctp_skb_cb *cb;
> +
> +       mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
> +       if (!buffer) {
> +               dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> +               return;
> +       }
> +
> +       spin_lock(&mctp_pcc_ndev->inbox.packets.lock);
> +       skb_queue_walk(&mctp_pcc_ndev->inbox.packets, curr_skb) {
> +               skb = curr_skb;

You're setting skb unconditionally here...

> +               if (skb->data != buffer)
> +                       continue;
> +               __skb_unlink(skb, &mctp_pcc_ndev->inbox.packets);
> +               break;
> +       }
> +       spin_unlock(&mctp_pcc_ndev->inbox.packets.lock);
> +
> +       if (skb) {

... so in the case where no skb matches, this will netif_rx() the last
skb in your list.

Only set the skb var on a match.

> +               dev_dstats_rx_add(mctp_pcc_ndev->ndev, skb->len);
> +               skb_reset_mac_header(skb);
> +               skb_pull(skb, sizeof(pcc_header));
> +               skb_reset_network_header(skb);
> +               cb = __mctp_cb(skb);
> +               cb->halen = 0;
> +               netif_rx(skb);
> +       }
> +}
> +
> +static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev;
> +       struct mctp_pcc_mailbox *box;
> +       struct sk_buff *skb = NULL;
> +       struct sk_buff *curr_skb;
> +
> +       mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
> +       box = container_of(c, struct mctp_pcc_mailbox, client);
> +       spin_lock(&box->packets.lock);
> +       skb_queue_walk(&box->packets, curr_skb) {
> +               skb = curr_skb;
> +               if (skb->data == mssg) {
> +                       __skb_unlink(skb, &box->packets);
> +                       break;
> +               }
> +       }
> +       spin_unlock(&box->packets.lock);
> +
> +       if (skb)
> +               dev_consume_skb_any(skb);

And the same with the loop logic here.

> +}
> +
> +static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
> +{
> +       struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
> +       struct pcc_header *pcc_header;
> +       int len = skb->len;
> +       int rc;
> +
> +       rc = skb_cow_head(skb, sizeof(*pcc_header));
> +       if (rc) {
> +               dev_dstats_tx_dropped(ndev);
> +               kfree_skb(skb);
> +               return NETDEV_TX_OK;
> +       }
> +
> +       pcc_header = skb_push(skb, sizeof(*pcc_header));
> +       pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
> +       pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
> +       memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
> +       pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
> +
> +       skb_queue_head(&mpnd->outbox.packets, skb);
> +
> +       rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
> +
> +       if (rc < 0) {
> +               skb_unlink(skb, &mpnd->outbox.packets);
> +               return NETDEV_TX_BUSY;
> +       }

Merging your other email thread into this one:

> Which means the failed-to-send packet is unlinked.  I guess I am unclear
> if this is sufficient to deal with the packet flow control issue or not.

The unlinking is fine - the issue is that you're not doing anything to
prevent that an immediate retransmit attempt of that same skb.

> I have not yet had a setup up where I can flood the network with packets
> and see what happens if I fill up the ring buffer.  I think that is the
> most likely failure case that will lead to flow control issues.  If the
> remote side cannot handle packets as fast as they are sent, at some
> point we have to stop sending them.  The mailbox abstraction makes that
> hard to detect;

If the mbox API has any facility for you to get a notification on
available space, you could stop the queues here, and restart them on
that notification.

(or even better, you could stop the queues when the skb that fills the
channel is sent)

Does the mbox API give you any facility for tx backpressure?

Otherwise, I guess we're OK, just inefficient.

> +
> +       dev_dstats_tx_add(ndev, len);
> +       return NETDEV_TX_OK;
> +}
> +
> +static void drain_packets(struct sk_buff_head *list)
> +{
> +       struct sk_buff *skb;
> +
> +       while (!skb_queue_empty(list)) {
> +               skb = skb_dequeue(list);
> +               dev_consume_skb_any(skb);
> +       }
> +}

Might be better to avoid TOCTOU issues between the skb_queue_empty() and
skb_dequeue() - something like:

	for (;;) {
		skb = skb_dequeue(list);
		if (!skb)
			break;
		dev_consume_skb_any(skb);
	}

(or some better incantation of the loop)

However, I assume doing these without any locking is okay, if you can
guarantee that no further mbox callbacks will happen after the
free_channel() returns - and that the free is synchronous.

(or if you can't guarantee that, you have other issues in the cleanup
path)

If you do change this to lockless, it would warrant some commenting
here, because the locking would differ from other sites.


> +static int mctp_pcc_ndo_open(struct net_device *ndev)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev =
> +           netdev_priv(ndev);
> +       struct mctp_pcc_mailbox *outbox =
> +           &mctp_pcc_ndev->outbox;
> +       struct mctp_pcc_mailbox *inbox =
> +           &mctp_pcc_ndev->inbox;
> +       int mctp_pcc_mtu;

Minor: as mentioned in the other thread, no need to RCT this.

> +
> +       outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
> +       if (IS_ERR(outbox->chan))
> +               return PTR_ERR(outbox->chan);
> +
> +       inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
> +       if (IS_ERR(inbox->chan)) {
> +               pcc_mbox_free_channel(outbox->chan);
> +               return PTR_ERR(inbox->chan);
> +       }
> +
> +       mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
> +       mctp_pcc_ndev->outbox.chan->manage_writes = true;

Minor: you have the convenience vars for ->inbox and ->outbox, may as
well use them.

> +
> +       mctp_pcc_mtu = mctp_pcc_ndev->outbox.chan->shmem_size -
> +               sizeof(struct pcc_header);
> +       ndev->mtu = MCTP_MIN_MTU;
> +       ndev->max_mtu = mctp_pcc_mtu;
> +       ndev->min_mtu = MCTP_MIN_MTU;

Merging with your other thread:

> > For my own clarity, what's "the callback function used for
> > registration"?
> 
> Actually, this is not longer true: we can do it in ndo_open, and it is
> clean.  Removed the comment.

On second thought, why are these being set on ndo_open? it should be
possible to:

  $ mctp link set mctpipcc0 mtu 252
  $ mctp link set mctpipcc0 up

the latter would trigger the ->ndo_open, so you'll lose the MTU set up
from the former. You probably don't want to lose settings on a down->up
cycle.

(also, what does the 'i' signify, in mctpipcc?)

> +
> +       return 0;
> +}
> +
> +static int mctp_pcc_ndo_stop(struct net_device *ndev)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev =
> +           netdev_priv(ndev);
> +       struct mctp_pcc_mailbox *outbox =
> +           &mctp_pcc_ndev->outbox;
> +       struct mctp_pcc_mailbox *inbox =
> +           &mctp_pcc_ndev->inbox;
> +
> +       pcc_mbox_free_channel(outbox->chan);
> +       pcc_mbox_free_channel(inbox->chan);
> +
> +       drain_packets(&mctp_pcc_ndev->outbox.packets);
> +       drain_packets(&mctp_pcc_ndev->inbox.packets);
> +       return 0;
> +}
> +
> +static const struct net_device_ops mctp_pcc_netdev_ops = {
> +       .ndo_open = mctp_pcc_ndo_open,
> +       .ndo_stop = mctp_pcc_ndo_stop,
> +       .ndo_start_xmit = mctp_pcc_tx,
> +
> +};
> +
> +static void mctp_pcc_setup(struct net_device *ndev)
> +{
> +       ndev->type = ARPHRD_MCTP;
> +       ndev->hard_header_len = 0;
> +       ndev->tx_queue_len = 0;
> +       ndev->flags = IFF_NOARP;
> +       ndev->netdev_ops = &mctp_pcc_netdev_ops;
> +       ndev->needs_free_netdev = true;
> +       ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
> +}
> +
> +struct mctp_pcc_lookup_context {
> +       int index;
> +       u32 inbox_index;
> +       u32 outbox_index;
> +};
> +
> +static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
> +                                      void *context)
> +{
> +       struct mctp_pcc_lookup_context *luc = context;
> +       struct acpi_resource_address32 *addr;
> +
> +       if (ares->type != PCC_DWORD_TYPE)
> +               return AE_OK;
> +
> +       addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
> +       switch (luc->index) {
> +       case 0:
> +               luc->outbox_index = addr[0].address.minimum;
> +               break;
> +       case 1:
> +               luc->inbox_index = addr[0].address.minimum;
> +               break;
> +       }
> +       luc->index++;
> +       return AE_OK;
> +}
> +
> +static void mctp_cleanup_netdev(void *data)
> +{
> +       struct net_device *ndev = data;
> +
> +       mctp_unregister_netdev(ndev);
> +}
> +
> +static int mctp_pcc_initialize_mailbox(struct device *dev,
> +                                      struct mctp_pcc_mailbox *box, u32 index)
> +{
> +       box->index = index;
> +       skb_queue_head_init(&box->packets);
> +       box->client.dev = dev;
> +       return 0;
> +}
> +
> +static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
> +{
> +       struct mctp_pcc_lookup_context context = {0};
> +       struct mctp_pcc_ndev *mctp_pcc_ndev;
> +       struct device *dev = &acpi_dev->dev;
> +       struct net_device *ndev;
> +       acpi_handle dev_handle;
> +       acpi_status status;
> +       char name[32];
> +       int rc;
> +
> +       dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
> +               acpi_device_hid(acpi_dev));
> +       dev_handle = acpi_device_handle(acpi_dev);
> +       status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
> +                                    &context);
> +       if (!ACPI_SUCCESS(status)) {
> +               dev_err(dev, "FAILURE to lookup PCC indexes from CRS\n");
> +               return -EINVAL;
> +       }
> +
> +       snprintf(name, sizeof(name), "mctpipcc%d", context.inbox_index);
> +       ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
> +                           mctp_pcc_setup);
> +       if (!ndev)
> +               return -ENOMEM;
> +
> +       mctp_pcc_ndev = netdev_priv(ndev);
> +
> +       /* inbox initialization */
> +       rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->inbox,
> +                                        context.inbox_index);
> +       if (rc)
> +               goto free_netdev;
> +
> +       mctp_pcc_ndev->inbox.client.rx_callback = mctp_pcc_client_rx_callback;
> +
> +       /* outbox initialization */
> +       rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->outbox,
> +                                        context.outbox_index);
> +       if (rc)
> +               goto free_netdev;
> +
> +       mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
> +       mctp_pcc_ndev->acpi_device = acpi_dev;
> +       mctp_pcc_ndev->ndev = ndev;
> +       acpi_dev->driver_data = mctp_pcc_ndev;
> +
> +       /* ndev needs to be freed before the iomemory (mapped above) gets
> +        * unmapped,  devm resources get freed in reverse to the order they
> +        * are added.
> +        */

minor: this comment seems stale; we're not doing any mapping or
unmapping here. What you care about is that the netdev is unregistered
(and hence ->ndo_stop() invoked, freeing the allocated channels) when
the struct acpi_device is destroyed.

Cheers,


Jeremy

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-27  4:48 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
  2025-08-27  7:36   ` Jeremy Kerr
@ 2025-08-27 17:54   ` kernel test robot
  2025-08-29  9:17   ` kernel test robot
  2025-08-29 19:32   ` ALOK TIWARI
  3 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2025-08-27 17:54 UTC (permalink / raw)
  To: admiyo, Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: llvm, oe-kbuild-all, netdev, linux-kernel, Sudeep Holla,
	Jonathan Cameron, Huisong Li

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/admiyo-os-amperecomputing-com/mctp-pcc-Implement-MCTP-over-PCC-Transport/20250827-124953
base:   net-next/main
patch link:    https://lore.kernel.org/r/20250827044810.152775-2-admiyo%40os.amperecomputing.com
patch subject: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
config: i386-randconfig-002-20250827 (https://download.01.org/0day-ci/archive/20250828/202508280145.bix2s4fv-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250828/202508280145.bix2s4fv-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508280145.bix2s4fv-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/net/mctp/mctp-pcc.c:110:24: warning: variable 'mctp_pcc_ndev' set but not used [-Wunused-but-set-variable]
     110 |         struct mctp_pcc_ndev *mctp_pcc_ndev;
         |                               ^
   1 warning generated.


vim +/mctp_pcc_ndev +110 drivers/net/mctp/mctp-pcc.c

   107	
   108	static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
   109	{
 > 110		struct mctp_pcc_ndev *mctp_pcc_ndev;
   111		struct mctp_pcc_mailbox *box;
   112		struct sk_buff *skb = NULL;
   113		struct sk_buff *curr_skb;
   114	
   115		mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
   116		box = container_of(c, struct mctp_pcc_mailbox, client);
   117		spin_lock(&box->packets.lock);
   118		skb_queue_walk(&box->packets, curr_skb) {
   119			skb = curr_skb;
   120			if (skb->data == mssg) {
   121				__skb_unlink(skb, &box->packets);
   122				break;
   123			}
   124		}
   125		spin_unlock(&box->packets.lock);
   126	
   127		if (skb)
   128			dev_consume_skb_any(skb);
   129	}
   130	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-27  4:48 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
  2025-08-27  7:36   ` Jeremy Kerr
  2025-08-27 17:54   ` kernel test robot
@ 2025-08-29  9:17   ` kernel test robot
  2025-08-29 19:32   ` ALOK TIWARI
  3 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2025-08-29  9:17 UTC (permalink / raw)
  To: admiyo, Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: oe-kbuild-all, netdev, linux-kernel, Sudeep Holla,
	Jonathan Cameron, Huisong Li

Hi,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/admiyo-os-amperecomputing-com/mctp-pcc-Implement-MCTP-over-PCC-Transport/20250827-124953
base:   net-next/main
patch link:    https://lore.kernel.org/r/20250827044810.152775-2-admiyo%40os.amperecomputing.com
patch subject: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
config: i386-randconfig-015-20250829 (https://download.01.org/0day-ci/archive/20250829/202508291628.CPVGma07-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14+deb12u1) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250829/202508291628.CPVGma07-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508291628.CPVGma07-lkp@intel.com/

All errors (new ones prefixed by >>):

   ld: drivers/net/mctp/mctp-pcc.o: in function `mctp_pcc_tx':
>> drivers/net/mctp/mctp-pcc.c:153: undefined reference to `mbox_send_message'


vim +153 drivers/net/mctp/mctp-pcc.c

   130	
   131	static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
   132	{
   133		struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
   134		struct pcc_header *pcc_header;
   135		int len = skb->len;
   136		int rc;
   137	
   138		rc = skb_cow_head(skb, sizeof(*pcc_header));
   139		if (rc) {
   140			dev_dstats_tx_dropped(ndev);
   141			kfree_skb(skb);
   142			return NETDEV_TX_OK;
   143		}
   144	
   145		pcc_header = skb_push(skb, sizeof(*pcc_header));
   146		pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
   147		pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
   148		memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
   149		pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
   150	
   151		skb_queue_head(&mpnd->outbox.packets, skb);
   152	
 > 153		rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
   154	
   155		if (rc < 0) {
   156			skb_unlink(skb, &mpnd->outbox.packets);
   157			return NETDEV_TX_BUSY;
   158		}
   159	
   160		dev_dstats_tx_add(ndev, len);
   161		return NETDEV_TX_OK;
   162	}
   163	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-27  4:48 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
                     ` (2 preceding siblings ...)
  2025-08-29  9:17   ` kernel test robot
@ 2025-08-29 19:32   ` ALOK TIWARI
  2025-09-01  0:59     ` Jeremy Kerr
  3 siblings, 1 reply; 14+ messages in thread
From: ALOK TIWARI @ 2025-08-29 19:32 UTC (permalink / raw)
  To: admiyo, Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li



On 8/27/2025 10:18 AM, admiyo@os.amperecomputing.com wrote:
> From: Adam Young <admiyo@os.amperecomputing.com>
> 
> Implementation of network driver for
> Management Control Transport Protocol(MCTP)

it is Management Component Transport Protocol (see DMTF spec).

> over Platform Communication Channel(PCC)
> 
> DMTF DSP:0292
> https://urldefense.com/v3/__https://www.dmtf.org/sites/default/files/standards/documents/*__;Lw!!ACWV5N9M2RV99hQ!JkW80M1xGJjvaXd72o192mqV0uzOu511ibGTz-JCtmbsM_IrpuZ0jJeeQFOug5UHp8fNY1dX2Lk0_hPUUY__KokwgtE$
> DSP0292_1.0.0WIP50.pdf
> 
> MCTP devices are specified via ACPI by entries
> in DSDT/SDST and reference channels specified

SDST -> SSDT

> in the PCCT.  Messages are sent on a type 3 and
> received on a type 4 channel.  Communication with
> other devices use the PCC based doorbell mechanism;
> a shared memory segment with a corresponding
> interrupt and a memory register used to trigger
> remote interrupts.
> 
> This driver takes advantage of PCC mailbox buffer
> management. The data section of the struct sk_buff
> that contains the outgoing packet is sent to the mailbox,
> already properly formatted  as a PCC message.  The driver
> is also responsible for allocating a struct sk_buff that
> is then passed to the mailbox and used to record the
> data in the shared buffer. It maintains a list of both
> outging and incoming sk_buffs to match the data buffers

outging

> 
> When the Type 3 channel outbox receives a txdone response
> interrupt, it consumes the outgoing sk_buff, allowing
> it to be freed.
> 
> Bringing the interface up and down creates and frees
> the channel between the network driver and the mailbox
> driver. Freeing the channel also frees any packets that
> are cached in the mailbox ringbuffer.
> 
> Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
> ---
>   MAINTAINERS                 |   5 +
>   drivers/net/mctp/Kconfig    |  13 ++
>   drivers/net/mctp/Makefile   |   1 +
>   drivers/net/mctp/mctp-pcc.c | 367 ++++++++++++++++++++++++++++++++++++
>   4 files changed, 386 insertions(+)
>   create mode 100644 drivers/net/mctp/mctp-pcc.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index bce96dd254b8..de359bddcb2f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -14660,6 +14660,11 @@ F:	include/net/mctpdevice.h
>   F:	include/net/netns/mctp.h
>   F:	net/mctp/
>   
> +MANAGEMENT COMPONENT TRANSPORT PROTOCOL (MCTP) over PCC (MCTP-PCC) Driver
> +M:	Adam Young <admiyo@os.amperecomputing.com>
> +S:	Maintained
> +F:	drivers/net/mctp/mctp-pcc.c
> +
>   MAPLE TREE
>   M:	Liam R. Howlett <Liam.Howlett@oracle.com>
>   L:	maple-tree@lists.infradead.org
> diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
> index cf325ab0b1ef..f69d0237f058 100644
> --- a/drivers/net/mctp/Kconfig
> +++ b/drivers/net/mctp/Kconfig
> @@ -57,6 +57,19 @@ config MCTP_TRANSPORT_USB
>   	  MCTP-over-USB interfaces are peer-to-peer, so each interface
>   	  represents a physical connection to one remote MCTP endpoint.
>   
> +config MCTP_TRANSPORT_PCC
> +	tristate "MCTP PCC transport"
> +	depends on ACPI
> +	help
> +	  Provides a driver to access MCTP devices over PCC transport,
> +	  A MCTP protocol network device is created via ACPI for each
> +	  entry in the DST/SDST that matches the identifier. The Platform

should be DSDT/SSDT ?

> +	  communication channels are selected from the corresponding
> +	  entries in the PCCT.
> +
> +	  Say y here if you need to connect to MCTP endpoints over PCC. To
> +	  compile as a module, use m; the module will be called mctp-pcc.
> +
>   endmenu
>   
>   endif
> diff --git a/drivers/net/mctp/Makefile b/drivers/net/mctp/Makefile
> index c36006849a1e..2276f148df7c 100644
> --- a/drivers/net/mctp/Makefile
> +++ b/drivers/net/mctp/Makefile
> @@ -1,3 +1,4 @@
> +obj-$(CONFIG_MCTP_TRANSPORT_PCC) += mctp-pcc.o
>   obj-$(CONFIG_MCTP_SERIAL) += mctp-serial.o
>   obj-$(CONFIG_MCTP_TRANSPORT_I2C) += mctp-i2c.o
>   obj-$(CONFIG_MCTP_TRANSPORT_I3C) += mctp-i3c.o
> diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
> new file mode 100644
> index 000000000000..c6578b27c00c
> --- /dev/null
> +++ b/drivers/net/mctp/mctp-pcc.c
> @@ -0,0 +1,367 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * mctp-pcc.c - Driver for MCTP over PCC.
> + * Copyright (c) 2024-2025, Ampere Computing LLC
> + *
> + */
> +
> +/* Implementation of MCTP over PCC DMTF Specification DSP0256

DSP0256 vs DSP0292 mismatch

> + * https://urldefense.com/v3/__https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf__;!!ACWV5N9M2RV99hQ!JkW80M1xGJjvaXd72o192mqV0uzOu511ibGTz-JCtmbsM_IrpuZ0jJeeQFOug5UHp8fNY1dX2Lk0_hPUUY__EEa7amg$
> + */
> +
> +#include <linux/acpi.h>
> +#include <linux/if_arp.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/mailbox_client.h>
> +#include <linux/module.h>
> +#include <linux/netdevice.h>
> +#include <linux/platform_device.h>
> +#include <linux/string.h>
> +#include <linux/skbuff.h>
> +#include <linux/hrtimer.h>
> +
> +#include <acpi/acpi_bus.h>
> +#include <acpi/acpi_drivers.h>
> +#include <acpi/acrestyp.h>
> +#include <acpi/actbl.h>
> +#include <net/mctp.h>
> +#include <net/mctpdevice.h>
> +#include <acpi/pcc.h>
> +
> +#define MCTP_SIGNATURE          "MCTP"
> +#define MCTP_SIGNATURE_LENGTH   (sizeof(MCTP_SIGNATURE) - 1)
> +#define MCTP_MIN_MTU            68
> +#define PCC_DWORD_TYPE          0x0c
> +
> +struct mctp_pcc_mailbox {
> +	u32 index;
> +	struct pcc_mbox_chan *chan;
> +	struct mbox_client client;
> +	struct sk_buff_head packets;
> +};
> +
> +/* The netdev structure. One of these per PCC adapter. */
> +struct mctp_pcc_ndev {
> +	struct net_device *ndev;
> +	struct acpi_device *acpi_device;
> +	struct mctp_pcc_mailbox inbox;
> +	struct mctp_pcc_mailbox outbox;
> +};
> +
> +static void *mctp_pcc_rx_alloc(struct mbox_client *c, int size)
> +{
> +	struct mctp_pcc_ndev *mctp_pcc_ndev;
> +	struct mctp_pcc_mailbox *box;
> +	struct sk_buff *skb;
> +
> +	mctp_pcc_ndev =	container_of(c, struct mctp_pcc_ndev, inbox.client);
> +	box = &mctp_pcc_ndev->inbox;
> +
> +	if (size > mctp_pcc_ndev->ndev->mtu)
> +		return NULL;
> +	skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
> +	if (!skb)
> +		return NULL;
> +	skb_put(skb, size);
> +	skb->protocol = htons(ETH_P_MCTP);
> +	skb_queue_head(&box->packets, skb);
> +
> +	return skb->data;
> +}
> +
> +static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *buffer)
> +{
> +	struct mctp_pcc_ndev *mctp_pcc_ndev;
> +	struct sk_buff *curr_skb = NULL;
> +	struct pcc_header pcc_header;
> +	struct sk_buff *skb = NULL;
> +	struct mctp_skb_cb *cb;
> +
> +	mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
> +	if (!buffer) {
> +		dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> +		return;
> +	}
> +
> +	spin_lock(&mctp_pcc_ndev->inbox.packets.lock);
> +	skb_queue_walk(&mctp_pcc_ndev->inbox.packets, curr_skb) {
> +		skb = curr_skb;
> +		if (skb->data != buffer)
> +			continue;
> +		__skb_unlink(skb, &mctp_pcc_ndev->inbox.packets);
> +		break;
> +	}
> +	spin_unlock(&mctp_pcc_ndev->inbox.packets.lock);
> +
> +	if (skb) {
> +		dev_dstats_rx_add(mctp_pcc_ndev->ndev, skb->len);
> +		skb_reset_mac_header(skb);
> +		skb_pull(skb, sizeof(pcc_header));
> +		skb_reset_network_header(skb);
> +		cb = __mctp_cb(skb);
> +		cb->halen = 0;
> +		netif_rx(skb);
> +	}
> +}
> +
> +static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
> +{
> +	struct mctp_pcc_ndev *mctp_pcc_ndev;
> +	struct mctp_pcc_mailbox *box;
> +	struct sk_buff *skb = NULL;
> +	struct sk_buff *curr_skb;
> +
> +	mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, outbox.client);
> +	box = container_of(c, struct mctp_pcc_mailbox, client);
> +	spin_lock(&box->packets.lock);
> +	skb_queue_walk(&box->packets, curr_skb) {
> +		skb = curr_skb;
> +		if (skb->data == mssg) {
> +			__skb_unlink(skb, &box->packets);
> +			break;
> +		}
> +	}
> +	spin_unlock(&box->packets.lock);
> +
> +	if (skb)
> +		dev_consume_skb_any(skb);
> +}
> +
> +static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
> +{
> +	struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
> +	struct pcc_header *pcc_header;
> +	int len = skb->len;
> +	int rc;
> +
> +	rc = skb_cow_head(skb, sizeof(*pcc_header));
> +	if (rc) {
> +		dev_dstats_tx_dropped(ndev);
> +		kfree_skb(skb);
> +		return NETDEV_TX_OK;
> +	}
> +
> +	pcc_header = skb_push(skb, sizeof(*pcc_header));
> +	pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
> +	pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
> +	memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
> +	pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
> +
> +	skb_queue_head(&mpnd->outbox.packets, skb);
> +
> +	rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
> +
> +	if (rc < 0) {
> +		skb_unlink(skb, &mpnd->outbox.packets);
> +		return NETDEV_TX_BUSY;
> +	}
> +
> +	dev_dstats_tx_add(ndev, len);
> +	return NETDEV_TX_OK;
> +}
> +
> +static void drain_packets(struct sk_buff_head *list)
> +{
> +	struct sk_buff *skb;
> +
> +	while (!skb_queue_empty(list)) {
> +		skb = skb_dequeue(list);
> +		dev_consume_skb_any(skb);
> +	}
> +}
> +
> +static int mctp_pcc_ndo_open(struct net_device *ndev)
> +{
> +	struct mctp_pcc_ndev *mctp_pcc_ndev =
> +	    netdev_priv(ndev);
> +	struct mctp_pcc_mailbox *outbox =
> +	    &mctp_pcc_ndev->outbox;
> +	struct mctp_pcc_mailbox *inbox =
> +	    &mctp_pcc_ndev->inbox;
> +	int mctp_pcc_mtu;
> +
> +	outbox->chan = pcc_mbox_request_channel(&outbox->client, outbox->index);
> +	if (IS_ERR(outbox->chan))
> +		return PTR_ERR(outbox->chan);
> +
> +	inbox->chan = pcc_mbox_request_channel(&inbox->client, inbox->index);
> +	if (IS_ERR(inbox->chan)) {
> +		pcc_mbox_free_channel(outbox->chan);
> +		return PTR_ERR(inbox->chan);
> +	}
> +
> +	mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
> +	mctp_pcc_ndev->outbox.chan->manage_writes = true;
> +
> +	mctp_pcc_mtu = mctp_pcc_ndev->outbox.chan->shmem_size -
> +		sizeof(struct pcc_header);
> +	ndev->mtu = MCTP_MIN_MTU;
> +	ndev->max_mtu = mctp_pcc_mtu;
> +	ndev->min_mtu = MCTP_MIN_MTU;
> +
> +	return 0;
> +}
> +
> +static int mctp_pcc_ndo_stop(struct net_device *ndev)
> +{
> +	struct mctp_pcc_ndev *mctp_pcc_ndev =
> +	    netdev_priv(ndev);
> +	struct mctp_pcc_mailbox *outbox =
> +	    &mctp_pcc_ndev->outbox;
> +	struct mctp_pcc_mailbox *inbox =
> +	    &mctp_pcc_ndev->inbox;
> +
> +	pcc_mbox_free_channel(outbox->chan);
> +	pcc_mbox_free_channel(inbox->chan);
> +
> +	drain_packets(&mctp_pcc_ndev->outbox.packets);
> +	drain_packets(&mctp_pcc_ndev->inbox.packets);
> +	return 0;
> +}
> +
> +static const struct net_device_ops mctp_pcc_netdev_ops = {
> +	.ndo_open = mctp_pcc_ndo_open,
> +	.ndo_stop = mctp_pcc_ndo_stop,
> +	.ndo_start_xmit = mctp_pcc_tx,
> +
> +};
> +
> +static void mctp_pcc_setup(struct net_device *ndev)
> +{
> +	ndev->type = ARPHRD_MCTP;
> +	ndev->hard_header_len = 0;
> +	ndev->tx_queue_len = 0;
> +	ndev->flags = IFF_NOARP;
> +	ndev->netdev_ops = &mctp_pcc_netdev_ops;
> +	ndev->needs_free_netdev = true;
> +	ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
> +}
> +
> +struct mctp_pcc_lookup_context {
> +	int index;
> +	u32 inbox_index;
> +	u32 outbox_index;
> +};
> +
> +static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
> +				       void *context)
> +{
> +	struct mctp_pcc_lookup_context *luc = context;
> +	struct acpi_resource_address32 *addr;
> +
> +	if (ares->type != PCC_DWORD_TYPE)
> +		return AE_OK;
> +
> +	addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
> +	switch (luc->index) {
> +	case 0:
> +		luc->outbox_index = addr[0].address.minimum;
> +		break;
> +	case 1:
> +		luc->inbox_index = addr[0].address.minimum;
> +		break;
> +	}
> +	luc->index++;
> +	return AE_OK;
> +}
> +
> +static void mctp_cleanup_netdev(void *data)
> +{
> +	struct net_device *ndev = data;
> +
> +	mctp_unregister_netdev(ndev);
> +}
> +
> +static int mctp_pcc_initialize_mailbox(struct device *dev,
> +				       struct mctp_pcc_mailbox *box, u32 index)
> +{
> +	box->index = index;
> +	skb_queue_head_init(&box->packets);
> +	box->client.dev = dev;
> +	return 0;
> +}
> +
> +static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
> +{
> +	struct mctp_pcc_lookup_context context = {0};
> +	struct mctp_pcc_ndev *mctp_pcc_ndev;
> +	struct device *dev = &acpi_dev->dev;
> +	struct net_device *ndev;
> +	acpi_handle dev_handle;
> +	acpi_status status;
> +	char name[32];
> +	int rc;
> +
> +	dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
> +		acpi_device_hid(acpi_dev));
> +	dev_handle = acpi_device_handle(acpi_dev);
> +	status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
> +				     &context);
> +	if (!ACPI_SUCCESS(status)) {
> +		dev_err(dev, "FAILURE to lookup PCC indexes from CRS\n");

FAILURE to lookup -> failed to lookup

> +		return -EINVAL;
> +	}
> +
> +	snprintf(name, sizeof(name), "mctpipcc%d", context.inbox_index);

mctp_pcc%d ?

> +	ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
> +			    mctp_pcc_setup);
> +	if (!ndev)
> +		return -ENOMEM;
> +
> +	mctp_pcc_ndev = netdev_priv(ndev);
> +
> +	/* inbox initialization */
> +	rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->inbox,
> +					 context.inbox_index);
> +	if (rc)
> +		goto free_netdev;
> +
> +	mctp_pcc_ndev->inbox.client.rx_callback = mctp_pcc_client_rx_callback;
> +
> +	/* outbox initialization */
> +	rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->outbox,
> +					 context.outbox_index);
> +	if (rc)
> +		goto free_netdev;
> +
> +	mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
> +	mctp_pcc_ndev->acpi_device = acpi_dev;
> +	mctp_pcc_ndev->ndev = ndev;
> +	acpi_dev->driver_data = mctp_pcc_ndev;
> +
> +	/* ndev needs to be freed before the iomemory (mapped above) gets
> +	 * unmapped,  devm resources get freed in reverse to the order they
> +	 * are added.
> +	 */
> +	rc = mctp_register_netdev(ndev, NULL, MCTP_PHYS_BINDING_PCC);
> +	if (rc)
> +		goto free_netdev;
> +
> +	return devm_add_action_or_reset(dev, mctp_cleanup_netdev, ndev);
> +free_netdev:
> +	free_netdev(ndev);
> +	return rc;
> +}
> +
> +static const struct acpi_device_id mctp_pcc_device_ids[] = {
> +	{ "DMT0001" },
> +	{}
> +};
> +
> +static struct acpi_driver mctp_pcc_driver = {
> +	.name = "mctp_pcc",
> +	.class = "Unknown",
> +	.ids = mctp_pcc_device_ids,
> +	.ops = {
> +		.add = mctp_pcc_driver_add,
> +	},
> +};
> +
> +module_acpi_driver(mctp_pcc_driver);
> +
> +MODULE_DEVICE_TABLE(acpi, mctp_pcc_device_ids);
> +
> +MODULE_DESCRIPTION("MCTP PCC ACPI device");
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Adam Young <admiyo@os.amperecomputing.com>");

Thanks,
Alok


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-29 19:32   ` ALOK TIWARI
@ 2025-09-01  0:59     ` Jeremy Kerr
  0 siblings, 0 replies; 14+ messages in thread
From: Jeremy Kerr @ 2025-09-01  0:59 UTC (permalink / raw)
  To: ALOK TIWARI, admiyo, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

Hi Adam,

+1 on all the recommended changes from Alok, except for:

> > +               return -EINVAL;
> > +       }
> > +
> > +       snprintf(name, sizeof(name), "mctpipcc%d", context.inbox_index);
> 
> mctp_pcc%d ?

Please keep this as you have in v27, as mctppcc%d, so we're consistent
with other MCTP interface naming.

Cheers,



Jeremy

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2025-09-01  1:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-27  4:48 [PATCH net-next v25 0/1] MCTP Over PCC Transport admiyo
2025-08-27  4:48 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
2025-08-27  7:36   ` Jeremy Kerr
2025-08-27 17:54   ` kernel test robot
2025-08-29  9:17   ` kernel test robot
2025-08-29 19:32   ` ALOK TIWARI
2025-09-01  0:59     ` Jeremy Kerr
2025-08-27  4:57 ` [PATCH net-next v25 0/1] MCTP Over " Adam Young
  -- strict thread matches above, loose matches on Subject: below --
2025-08-19 20:51 admiyo
2025-08-19 20:51 ` [PATCH net-next v25 1/1] mctp pcc: Implement MCTP over " admiyo
2025-08-22  8:21   ` Jeremy Kerr
2025-08-26 20:51     ` Adam Young
2025-08-27  1:37       ` Jeremy Kerr
2025-08-27  4:55         ` Adam Young
2025-08-26 22:37     ` Adam Young

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).