linux-kernel.vger.kernel.org archive mirror
* [PATCH v24 0/1] MCTP Over PCC Transport
@ 2025-08-11 15:38 admiyo
  2025-08-11 15:38 ` [PATCH v24 1/1] mctp pcc: Implement MCTP over " admiyo
  0 siblings, 1 reply; 6+ messages in thread
From: admiyo @ 2025-08-11 15:38 UTC (permalink / raw)
  Cc: netdev, linux-kernel, Jeremy Kerr, Matt Johnston,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Sudeep Holla, Jonathan Cameron, Huisong Li

From: Linux Bot <linuxbot@amperecomputing.com>

This series adds support for the Management Component Transport Protocol (MCTP)
over the Platform Communication Channel (PCC) mechanism.

DMTF DSP0292
https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf

MCTP defines a communication model intended to
facilitate communication between management controllers
and other management controllers, and between management
controllers and management devices.

PCC is a mechanism for communication between components within
the platform.  It is composed of shared memory regions,
interrupt registers, and status registers.

The MCTP over PCC driver makes use of two PCC channels. For
sending messages, it uses a Type 3 channel, and for receiving
messages it uses the paired Type 4 channel.  The device
and corresponding channels are specified via ACPI.

The first patch in the series implements a mechanism to allow the driver
to indicate whether an ACK should be sent back to the caller
after processing the interrupt.  This is an optional feature in
the PCC code, but has been made explicitly required in another driver.
The implementation here maintains the backwards compatibility of that
driver.

MCTP is a general purpose protocol, so it would be impossible to enumerate
all the use cases, but some of the ones that are most topical are attestation
and RAS support.  There are a handful of protocols built on top of MCTP,
including PLDM and SPDM, both specified by the DMTF.

https://www.dmtf.org/sites/default/files/standards/documents/DSP0240_1.0.0.pdf
https://www.dmtf.org/sites/default/files/standards/documents/DSP0274_1.3.0.pd

SPDM entails various usages, including device identity collection, device
authentication, measurement collection, and device secure session establishment.

PLDM is more likely to be used for hardware support: temperature, voltage, or
fan sensor control.

At least two companies have devices that can make use of the mechanism. One is
Ampere Computing, my employer.

The mechanism it uses, the Platform Communications Channel, is part of the
ACPI spec: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/14_Platform_Communications_Channel/Platform_Comm_Channel.html

Since it is a socket interface, the system administrator also has the ability
to ignore an MCTP link that they do not want to enable.  This link would be visible
to the end user, but would not be usable.

If MCTP support is disabled in the kernel, this driver would also be disabled.

PCC is based on a shared buffer and a set of I/O mapped memory locations that the
spec calls registers.  This mechanism exists regardless of the existence of the
driver. Thus, if the user has the ability to map these physical locations to
virtual locations, they have the ability to drive the hardware.  Thus, there
is a security aspect to this mechanism that extends beyond the responsibilities
of the operating system.

If the hardware does not expose the PCC in the ACPI table, this device will never
be enabled.  Thus it is only an issue on hardware that does support PCC.  In that case,
it is up to the remote controller to sanitize communication; MCTP will be exposed
as a socket interface, and userland can send any crafted packet it wants.  It would
thus also be incumbent on the hardware manufacturer to allow the end user to disable
MCTP over PCC communication if they did not want to expose it.

Previous implementations of the PCC version of the mailbox protocol assumed the
driver was directly managing the shared memory region.  This led to duplicated
code and missed steps of the PCC protocol. The first patch in this series makes
it possible for mailbox/pcc to manage the writing of the buffer prior to sending
messages.  It also fixes the notification of message transmission completion.

Previous Version:
https://lore.kernel.org/lkml/20250715001011.90534-1-admiyo@os.amperecomputing.com/

Changes in V24:
- Removed endianness for PCC header values
- Kept column width to under 80 chars
- Typo in commit message
- Prerequisite patch for PCC buffer management was merged late in 6.17.
  See "mailbox/pcc: support mailbox management of the shared buffer"

Changes in V23:
- Trigger for direct management of shared buffer based on flag in pcc channel
- Only initialize rx_alloc for inbox, not outbox.
- Read value for requested IRQ flag out of channel's current_req
- unqueue an sk_buff that failed to send
- Move error handling for skb resize error inline instead of goto

Changes in V22:
- Direct management of the shared buffer in the mailbox layer.
- Proper checking of command complete flag prior to writing to the buffer.

Changes in V21:
- Use existing constants PCC_SIGNATURE and PCC_CMD_COMPLETION_NOTIFY
- Check return code on call to send_data and drop packet if failed
- use sizeof(*mctp_pcc_header) etc, instead of structs for resizing buffers
- simplify check for ares->type != PCC_DWORD_TYPE
- simply return result of devm_add_action_or_reset
- reduce initializer for mctp_pcc_lookup_context context = {};
- move initialization of mbox dev into mctp_pcc_initialize_mailbox
- minor spacing changes

Changes in V20:
- corrected typo in RFC version
- removed spurious space
- tx spin lock only controls access to shared memory buffer
- tx spin lock not held on error condition
- tx returns OK if skb can't be expanded

Changes in V19:
- Rebased on changes to PCC mailbox handling
- checks for cloned SKB prior to transmission
- converted double slash comments to C comments

Changes in V18:
- Added Acked-By
- Fix minor spacing issue

Changes in V17:
- No new changes. Rebased on net-next post 6.13 release.

Changes in V16:
- do not duplicate cleanup after devm_add_action_or_reset calls

Changes in V15:
- corrected indentation formatting error
- Corrected TABS issue in MAINTAINER entry

Changes in V14:
- Do not attempt to unregister a netdev that is never registered
- Added MAINTAINER entry

Changes in V13:
- Explicitly Convert PCC header from little endian to machine native

Changes in V12:
- Explicitly use little endian conversion for PCC header signature
- Builds clean with make C=1

Changes in V11:
- Explicitly use little endian types for PCC header

Changes in V11:
- Switch Big Endian data types to machine local for PCC header
- use mctp specific function for registering netdev

Changes in V10:
- sync with net-next branch
- use dstats helper functions
- remove duplicate drop stat
- remove more double spaces

Changes in V9:
- Prerequisite patch for PCC mailbox has been merged
- Stats collection now use helper functions
- many double spaces reduced to single

Changes in V8:
- change 0 to NULL for pointer check of shmem
- add semi for static version of pcc_mbox_ioremap
- convert pcc_mbox_ioremap function to static inline when client code is not being built
- remove shmem comment from struct pcc_chan_info descriptor
- copy rx_dropped in mctp_pcc_net_stats
- removed trailing newline on error message
- removed double space in dev_dbg string
- use big endian for header members
- Use full spec ID in description
- Fix typo in file description
- Form the complete outbound message in the sk_buff

Changes in V7:
- Removed the Hardware address as specification is not published.
- Map the shared buffer in the mailbox and share the mapped region with the driver
- Use the sk_buff memory to prepare the message before copying to shared region

Changes in V6:
- Removed patch for ACPICA code that has merged
- Includes the hardware address in the network device
- Converted all device resources to devm resources
- Removed mctp_pcc_driver_remove function
- uses acpi_driver_module for initialization
- created helper structure for in and out mailboxes
- Consolidated code for initializing mailboxes in the add_device function
- Added specification references
- Removed duplicate constant PCC_ACK_FLAG_MASK
- Use the MCTP_SIGNATURE_LENGTH define
- made naming of header structs consistent
- use sizeof local variables for offset calculations
- prefix structure name to avoid potential clash
- removed unnecessary null initialization from acpi_device_id

Changes in V5
- Removed Owner field from ACPI module declaration
- removed unused next field from struct mctp_pcc_ndev
- Corrected logic reading RX ACK flag.
- Added comment for struct pcc_chan_info field shmem_base_addr
- check against current mtu instead of max mtu for packet length
- removed unnecessary lookups of pnd->mdev.dev

Changes in V4
- Read flags out of shared buffer to trigger ACK for Type 4 RX
- Remove list of netdevs and cleanup from devices only
- tag PCCT protocol headers as little endian
- Remove unused constants

Changes in V3
- removed unused header
- removed spurious space
- removed spurious semis after functions
- removed null assignment for init
- remove redundant set of device on skb
- tabify constant declarations
- added rtnl_link_stats64 function
- set MTU to minimum to start
- clean up logic on driver removal
- remove cast on void * assignment
- call cleanup function directly
- check received length before allocating skb
- introduce symbolic constant for ACK FLAG MASK
- symbolic constant for PCC header flag.
- Add namespace ID to PCC magic
- replaced readls with copy from io of PCC header
- replaced custom modules init and cleanup with ACPI version

Changes in V2

- All Variable Declarations are in reverse Xmas Tree Format
- All Checkpatch Warnings Are Fixed
- Removed Dead code
- Added packet tx/rx stats
- Removed network physical address.  This is still in
  discussion in the spec, and will be added once there
  is consensus. The protocol can be used without it.
  This also led to the removal of the Big Endian
  conversions.
- Avoided using non volatile pointers in copy to and from io space
- Reordered the patches to put the ACK check for the PCC Mailbox
  as a pre-requisite.  The corresponding change for the MCTP
  driver has been inlined in the main patch.
- Replaced magic numbers with constants, fixed typos, and other
  minor changes from code review.

Adam Young (1):
  mctp pcc: Implement MCTP over PCC Transport

 MAINTAINERS                 |   5 +
 drivers/net/mctp/Kconfig    |  13 ++
 drivers/net/mctp/Makefile   |   1 +
 drivers/net/mctp/mctp-pcc.c | 348 ++++++++++++++++++++++++++++++++++++
 4 files changed, 367 insertions(+)
 create mode 100644 drivers/net/mctp/mctp-pcc.c

-- 
2.43.0



* [PATCH v24 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-11 15:38 [PATCH v24 0/1] MCTP Over PCC Transport admiyo
@ 2025-08-11 15:38 ` admiyo
  2025-08-13  4:46   ` Jeremy Kerr
  0 siblings, 1 reply; 6+ messages in thread
From: admiyo @ 2025-08-11 15:38 UTC (permalink / raw)
  To: Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Adam Young
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

From: Adam Young <admiyo@os.amperecomputing.com>

Implementation of network driver for
Management Component Transport Protocol (MCTP)
over Platform Communication Channel (PCC)

DMTF DSP0292
https://www.dmtf.org/sites/default/files/standards/documents/\
DSP0292_1.0.0WIP50.pdf

MCTP devices are specified via ACPI by entries
in the DSDT/SSDT and reference channels specified
in the PCCT.  Messages are sent on a type 3 and
received on a type 4 channel.  Communication with
other devices uses the PCC based doorbell mechanism:
a shared memory segment with a corresponding
interrupt and a memory register used to trigger
remote interrupts.

This driver takes advantage of PCC mailbox buffer
management. The data section of the struct sk_buff
that contains the outgoing packet is sent to the mailbox,
already properly formatted as a PCC message.  The driver
is also responsible for allocating a struct sk_buff that
is then passed to the mailbox and used to record the
data in the shared buffer. It maintains a list of both
outgoing and incoming sk_buffs to match the data buffers
with the original sk_buffs.

When the Type 3 channel outbox receives a txdone response
interrupt, it consumes the outgoing sk_buff, allowing
it to be freed.

Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
---
 MAINTAINERS                 |   5 +
 drivers/net/mctp/Kconfig    |  13 ++
 drivers/net/mctp/Makefile   |   1 +
 drivers/net/mctp/mctp-pcc.c | 348 ++++++++++++++++++++++++++++++++++++
 4 files changed, 367 insertions(+)
 create mode 100644 drivers/net/mctp/mctp-pcc.c

diff --git a/MAINTAINERS b/MAINTAINERS
index fe168477caa4..71c5d017c15e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14661,6 +14661,11 @@ F:	include/net/mctpdevice.h
 F:	include/net/netns/mctp.h
 F:	net/mctp/
 
+MANAGEMENT COMPONENT TRANSPORT PROTOCOL (MCTP) over PCC (MCTP-PCC) Driver
+M:	Adam Young <admiyo@os.amperecomputing.com>
+S:	Maintained
+F:	drivers/net/mctp/mctp-pcc.c
+
 MAPLE TREE
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
 L:	maple-tree@lists.infradead.org
diff --git a/drivers/net/mctp/Kconfig b/drivers/net/mctp/Kconfig
index cf325ab0b1ef..f69d0237f058 100644
--- a/drivers/net/mctp/Kconfig
+++ b/drivers/net/mctp/Kconfig
@@ -57,6 +57,19 @@ config MCTP_TRANSPORT_USB
 	  MCTP-over-USB interfaces are peer-to-peer, so each interface
 	  represents a physical connection to one remote MCTP endpoint.
 
+config MCTP_TRANSPORT_PCC
+	tristate "MCTP PCC transport"
+	depends on ACPI
+	help
+	  Provides a driver to access MCTP devices over PCC transport.
+	  An MCTP protocol network device is created via ACPI for each
+	  entry in the DSDT/SSDT that matches the identifier. The Platform
+	  communication channels are selected from the corresponding
+	  entries in the PCCT.
+
+	  Say y here if you need to connect to MCTP endpoints over PCC. To
+	  compile as a module, use m; the module will be called mctp-pcc.
+
 endmenu
 
 endif
diff --git a/drivers/net/mctp/Makefile b/drivers/net/mctp/Makefile
index c36006849a1e..2276f148df7c 100644
--- a/drivers/net/mctp/Makefile
+++ b/drivers/net/mctp/Makefile
@@ -1,3 +1,4 @@
+obj-$(CONFIG_MCTP_TRANSPORT_PCC) += mctp-pcc.o
 obj-$(CONFIG_MCTP_SERIAL) += mctp-serial.o
 obj-$(CONFIG_MCTP_TRANSPORT_I2C) += mctp-i2c.o
 obj-$(CONFIG_MCTP_TRANSPORT_I3C) += mctp-i3c.o
diff --git a/drivers/net/mctp/mctp-pcc.c b/drivers/net/mctp/mctp-pcc.c
new file mode 100644
index 000000000000..27af2838da37
--- /dev/null
+++ b/drivers/net/mctp/mctp-pcc.c
@@ -0,0 +1,348 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * mctp-pcc.c - Driver for MCTP over PCC.
+ * Copyright (c) 2024-2025, Ampere Computing LLC
+ *
+ */
+
+/* Implementation of MCTP over PCC DMTF Specification DSP0292
+ * https://www.dmtf.org/sites/default/files/standards/documents/DSP0292_1.0.0WIP50.pdf
+ */
+
+#include <linux/acpi.h>
+#include <linux/if_arp.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/platform_device.h>
+#include <linux/string.h>
+#include <linux/skbuff.h>
+#include <linux/hrtimer.h>
+
+#include <acpi/acpi_bus.h>
+#include <acpi/acpi_drivers.h>
+#include <acpi/acrestyp.h>
+#include <acpi/actbl.h>
+#include <net/mctp.h>
+#include <net/mctpdevice.h>
+#include <acpi/pcc.h>
+
+#include "../../mailbox/mailbox.h"
+
+#define MCTP_PAYLOAD_LENGTH     256
+#define MCTP_CMD_LENGTH         4
+#define MCTP_PCC_VERSION        0x1 /* DSP0292 a single version: 1 */
+#define MCTP_SIGNATURE          "MCTP"
+#define MCTP_SIGNATURE_LENGTH   (sizeof(MCTP_SIGNATURE) - 1)
+#define MCTP_MIN_MTU            68
+#define PCC_DWORD_TYPE          0x0c
+
+struct mctp_pcc_mailbox {
+	u32 index;
+	struct pcc_mbox_chan *chan;
+	struct mbox_client client;
+	struct sk_buff_head packets;
+};
+
+/* The netdev structure. One of these per PCC adapter. */
+struct mctp_pcc_ndev {
+	/* spinlock to serialize access to PCC outbox buffer and registers
+	 * Note that what PCC calls registers are memory locations, not CPU
+	 * Registers.  They include the fields used to synchronize access
+	 * between the OS and remote endpoints.
+	 *
+	 * Only the Outbox needs a spinlock, to prevent multiple
+	 * sent packets triggering multiple attempts to over write
+	 * the outbox.  The Inbox buffer is controlled by the remote
+	 * service and a spinlock would have no effect.
+	 */
+	spinlock_t lock;
+	struct net_device *ndev;
+	struct acpi_device *acpi_device;
+	struct mctp_pcc_mailbox inbox;
+	struct mctp_pcc_mailbox outbox;
+};
+
+static void *mctp_pcc_rx_alloc(struct mbox_client *c, int size)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev =
+		container_of(c, struct mctp_pcc_ndev, inbox.client);
+	struct mctp_pcc_mailbox *box = &mctp_pcc_ndev->inbox;
+	struct sk_buff *skb;
+
+	if (size > mctp_pcc_ndev->ndev->mtu)
+		return NULL;
+	skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
+	if (!skb)
+		return NULL;
+	skb_put(skb, size);
+	skb->protocol = htons(ETH_P_MCTP);
+
+	skb_queue_head(&box->packets, skb);
+
+	return skb->data;
+}
+
+static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *buffer)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct pcc_header pcc_header;
+	struct mctp_skb_cb *cb;
+	struct sk_buff *skb;
+
+	mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
+	if (!buffer) {
+		dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
+		return;
+	}
+
+	skb_queue_walk(&mctp_pcc_ndev->inbox.packets, skb) {
+		if (skb->data != buffer)
+			continue;
+		skb_unlink(skb, &mctp_pcc_ndev->inbox.packets);
+		dev_dstats_rx_add(mctp_pcc_ndev->ndev, skb->len);
+		skb_reset_mac_header(skb);
+		skb_pull(skb, sizeof(pcc_header));
+		skb_reset_network_header(skb);
+		cb = __mctp_cb(skb);
+		cb->halen = 0;
+		netif_rx(skb);
+		return;
+	}
+	pr_warn("Unmatched packet in mctp-pcc inbox packet list");
+}
+
+static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
+{
+	struct mctp_pcc_mailbox *box;
+	struct sk_buff *skb;
+
+	box = container_of(c, struct mctp_pcc_mailbox, client);
+	skb_queue_walk(&box->packets, skb) {
+		if (skb->data == mssg) {
+			skb_unlink(skb, &box->packets);
+			dev_consume_skb_any(skb);
+			break;
+		}
+	}
+}
+
+static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
+{
+	struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
+	struct pcc_header *pcc_header;
+	int len = skb->len;
+	int rc;
+
+	rc = skb_cow_head(skb, sizeof(*pcc_header));
+	if (rc) {
+		dev_dstats_tx_dropped(ndev);
+		kfree_skb(skb);
+		return NETDEV_TX_OK;
+	}
+
+	pcc_header = skb_push(skb, sizeof(*pcc_header));
+	pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
+	pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
+	memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
+	pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
+	skb_queue_head(&mpnd->outbox.packets, skb);
+
+	rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
+
+	if (rc < 0) {
+		skb_unlink(skb, &mpnd->outbox.packets);
+		return NETDEV_TX_BUSY;
+	}
+
+	dev_dstats_tx_add(ndev, len);
+	return NETDEV_TX_OK;
+}
+
+static const struct net_device_ops mctp_pcc_netdev_ops = {
+	.ndo_start_xmit = mctp_pcc_tx,
+};
+
+static const struct mctp_netdev_ops mctp_netdev_ops = {
+	NULL
+};
+
+static void mctp_pcc_setup(struct net_device *ndev)
+{
+	ndev->type = ARPHRD_MCTP;
+	ndev->hard_header_len = 0;
+	ndev->tx_queue_len = 0;
+	ndev->flags = IFF_NOARP;
+	ndev->netdev_ops = &mctp_pcc_netdev_ops;
+	ndev->needs_free_netdev = true;
+	ndev->pcpu_stat_type = NETDEV_PCPU_STAT_DSTATS;
+}
+
+struct mctp_pcc_lookup_context {
+	int index;
+	u32 inbox_index;
+	u32 outbox_index;
+};
+
+static acpi_status lookup_pcct_indices(struct acpi_resource *ares,
+				       void *context)
+{
+	struct mctp_pcc_lookup_context *luc = context;
+	struct acpi_resource_address32 *addr;
+
+	if (ares->type != PCC_DWORD_TYPE)
+		return AE_OK;
+
+	addr = ACPI_CAST_PTR(struct acpi_resource_address32, &ares->data);
+	switch (luc->index) {
+	case 0:
+		luc->outbox_index = addr[0].address.minimum;
+		break;
+	case 1:
+		luc->inbox_index = addr[0].address.minimum;
+		break;
+	}
+	luc->index++;
+	return AE_OK;
+}
+
+static void drain_packets(struct sk_buff_head *list)
+{
+	struct sk_buff *skb;
+
+	while (!skb_queue_empty(list)) {
+		skb = skb_dequeue(list);
+		dev_consume_skb_any(skb);
+	}
+}
+
+static void mctp_cleanup_netdev(void *data)
+{
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct net_device *ndev = data;
+
+	mctp_pcc_ndev = netdev_priv(ndev);
+	drain_packets(&mctp_pcc_ndev->outbox.packets);
+	drain_packets(&mctp_pcc_ndev->inbox.packets);
+
+	mctp_unregister_netdev(ndev);
+}
+
+static void mctp_cleanup_channel(void *data)
+{
+	struct pcc_mbox_chan *chan = data;
+
+	pcc_mbox_free_channel(chan);
+}
+
+static int mctp_pcc_initialize_mailbox(struct device *dev,
+				       struct mctp_pcc_mailbox *box, u32 index)
+{
+	box->index = index;
+	skb_queue_head_init(&box->packets);
+	box->chan = pcc_mbox_request_channel(&box->client, index);
+
+	box->client.dev = dev;
+	if (IS_ERR(box->chan))
+		return PTR_ERR(box->chan);
+	return devm_add_action_or_reset(dev, mctp_cleanup_channel, box->chan);
+}
+
+static int mctp_pcc_driver_add(struct acpi_device *acpi_dev)
+{
+	struct mctp_pcc_lookup_context context = {0};
+	struct mctp_pcc_ndev *mctp_pcc_ndev;
+	struct device *dev = &acpi_dev->dev;
+	struct net_device *ndev;
+	acpi_handle dev_handle;
+	acpi_status status;
+	int mctp_pcc_mtu;
+	char name[32];
+	int rc;
+
+	dev_dbg(dev, "Adding mctp_pcc device for HID %s\n",
+		acpi_device_hid(acpi_dev));
+	dev_handle = acpi_device_handle(acpi_dev);
+	status = acpi_walk_resources(dev_handle, "_CRS", lookup_pcct_indices,
+				     &context);
+	if (!ACPI_SUCCESS(status)) {
+		dev_err(dev, "FAILURE to lookup PCC indexes from CRS\n");
+		return -EINVAL;
+	}
+
+	snprintf(name, sizeof(name), "mctpipcc%d", context.inbox_index);
+	ndev = alloc_netdev(sizeof(*mctp_pcc_ndev), name, NET_NAME_PREDICTABLE,
+			    mctp_pcc_setup);
+	if (!ndev)
+		return -ENOMEM;
+
+	mctp_pcc_ndev = netdev_priv(ndev);
+	spin_lock_init(&mctp_pcc_ndev->lock);
+
+	/* inbox initialization */
+	rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->inbox,
+					 context.inbox_index);
+	if (rc)
+		goto free_netdev;
+
+	mctp_pcc_ndev->inbox.chan->rx_alloc = mctp_pcc_rx_alloc;
+	mctp_pcc_ndev->inbox.client.rx_callback = mctp_pcc_client_rx_callback;
+
+	/* outbox initialization */
+	rc = mctp_pcc_initialize_mailbox(dev, &mctp_pcc_ndev->outbox,
+					 context.outbox_index);
+	if (rc)
+		goto free_netdev;
+
+	mctp_pcc_ndev->outbox.chan->manage_writes = true;
+	mctp_pcc_ndev->outbox.client.tx_done = mctp_pcc_tx_done;
+	mctp_pcc_ndev->acpi_device = acpi_dev;
+	mctp_pcc_ndev->ndev = ndev;
+	acpi_dev->driver_data = mctp_pcc_ndev;
+
+	/* There is no clean way to pass the MTU to the callback function
+	 * used for registration, so set the values ahead of time.
+	 */
+	mctp_pcc_mtu = mctp_pcc_ndev->outbox.chan->shmem_size -
+		sizeof(struct pcc_header);
+	ndev->mtu = MCTP_MIN_MTU;
+	ndev->max_mtu = mctp_pcc_mtu;
+	ndev->min_mtu = MCTP_MIN_MTU;
+
+	/* ndev needs to be freed before the iomemory (mapped above) gets
+	 * unmapped,  devm resources get freed in reverse to the order they
+	 * are added.
+	 */
+	rc = mctp_register_netdev(ndev, &mctp_netdev_ops,
+				  MCTP_PHYS_BINDING_PCC);
+	if (rc)
+		goto free_netdev;
+
+	return devm_add_action_or_reset(dev, mctp_cleanup_netdev, ndev);
+free_netdev:
+	free_netdev(ndev);
+	return rc;
+}
+
+static const struct acpi_device_id mctp_pcc_device_ids[] = {
+	{ "DMT0001" },
+	{}
+};
+
+static struct acpi_driver mctp_pcc_driver = {
+	.name = "mctp_pcc",
+	.class = "Unknown",
+	.ids = mctp_pcc_device_ids,
+	.ops = {
+		.add = mctp_pcc_driver_add,
+	},
+};
+
+module_acpi_driver(mctp_pcc_driver);
+
+MODULE_DEVICE_TABLE(acpi, mctp_pcc_device_ids);
+
+MODULE_DESCRIPTION("MCTP PCC ACPI device");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Adam Young <admiyo@os.amperecomputing.com>");
-- 
2.43.0



* Re: [PATCH v24 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-11 15:38 ` [PATCH v24 1/1] mctp pcc: Implement MCTP over " admiyo
@ 2025-08-13  4:46   ` Jeremy Kerr
  2025-08-14 17:56     ` Adam Young
  2025-08-15 22:27     ` Adam Young
  0 siblings, 2 replies; 6+ messages in thread
From: Jeremy Kerr @ 2025-08-13  4:46 UTC (permalink / raw)
  To: admiyo, Matt Johnston, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

Hi Adam,

From the MCTP device side, mostly looks okay, but there are some things
that may need work on the mailbox interface. As I have mentioned
earlier, I'm not super familiar with that, so some of these may be more
clarifications on my side rather than changes...

I see that this is failing to build on net-next, as the mailbox changes
are not present there. You may need to coordinate for the merge.

> +#define MCTP_PAYLOAD_LENGTH     256
> +#define MCTP_CMD_LENGTH         4
> +#define MCTP_PCC_VERSION        0x1 /* DSP0292 a single version: 1 */

Regarding the comment: I don't think DSP0292 does define a version for
the transport data format? There's the DSP0236 header version, but
that's distinct from the transport implementation. You don't seem to
use these three now anyway, perhaps drop.

And just confirming: the pcc header format is now all host-endian - or
is that a firmware-specified endianness? (what happens if the OS may
boot in either BE or LE? is that at all possible for any PCC-capable
hardware, or are we otherwise guaranteed that we're the same endianness
as the consumer?)

> +struct mctp_pcc_mailbox {
> +       u32 index;
> +       struct pcc_mbox_chan *chan;
> +       struct mbox_client client;
> +       struct sk_buff_head packets;

If you're ever able to stash a context pointer in mailbox requests, that
would give you a great opportunity to remove the complexity of
maintaining this list.

No change required now necessarily, just something to track for later
simplification.
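
As a rough illustration of that idea, here is a minimal userspace sketch,
not kernel code.  The names struct pkt, struct request, find_by_data() and
find_by_context() are hypothetical stand-ins and are not part of the
mailbox API.  It only contrasts matching a completion by data pointer (what
the patch does with its sk_buff list) with reading the owner directly from
an assumed context field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an sk_buff: only what the sketch needs. */
struct pkt {
	struct pkt *next;
	void *data;
};

/* Current approach: the callback only receives the data pointer, so the
 * driver walks its pending list to find the owning packet.
 */
static struct pkt *find_by_data(struct pkt *head, const void *data)
{
	struct pkt *p;

	for (p = head; p; p = p->next)
		if (p->data == data)
			return p;
	return NULL;
}

/* Assumed alternative: a request that carries an opaque context pointer
 * next to the data pointer.  No such field exists in the patch.
 */
struct request {
	void *data;
	void *context;
};

/* With a context pointer the owner is recovered without any list walk. */
static struct pkt *find_by_context(const struct request *req)
{
	assert(req);
	return req->context;
}
```

With the context variant the pending list, and the locking it needs, would
no longer be required just to map a buffer back to its packet.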

> +};
> +
> +/* The netdev structure. One of these per PCC adapter. */
> +struct mctp_pcc_ndev {
> +       /* spinlock to serialize access to PCC outbox buffer and registers
> +        * Note that what PCC calls registers are memory locations, not CPU
> +        * Registers.  They include the fields used to synchronize access
> +        * between the OS and remote endpoints.
> +        *
> +        * Only the Outbox needs a spinlock, to prevent multiple
> +        * sent packets triggering multiple attempts to over write
> +        * the outbox.  The Inbox buffer is controlled by the remote
> +        * service and a spinlock would have no effect.
> +        */
> +       spinlock_t lock;

You don't use this anywhere, but I think you do need synchronisation
still. See my comment on tx_done.

> +       struct net_device *ndev;
> +       struct acpi_device *acpi_device;
> +       struct mctp_pcc_mailbox inbox;
> +       struct mctp_pcc_mailbox outbox;
> +};
> +
> +static void *mctp_pcc_rx_alloc(struct mbox_client *c, int size)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev =
> +               container_of(c, struct mctp_pcc_ndev, inbox.client);
> +       struct mctp_pcc_mailbox *box = &mctp_pcc_ndev->inbox;
> +       struct sk_buff *skb;
> +
> +       if (size > mctp_pcc_ndev->ndev->mtu)
> +               return NULL;
> +       skb = netdev_alloc_skb(mctp_pcc_ndev->ndev, size);
> +       if (!skb)
> +               return NULL;
> +       skb_put(skb, size);
> +       skb->protocol = htons(ETH_P_MCTP);
> +
> +       skb_queue_head(&box->packets, skb);
> +
> +       return skb->data;
> +}
> +
> +static void mctp_pcc_client_rx_callback(struct mbox_client *c, void *buffer)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev;
> +       struct pcc_header pcc_header;
> +       struct mctp_skb_cb *cb;
> +       struct sk_buff *skb;
> +
> +       mctp_pcc_ndev = container_of(c, struct mctp_pcc_ndev, inbox.client);
> +       if (!buffer) {
> +               dev_dstats_rx_dropped(mctp_pcc_ndev->ndev);
> +               return;
> +       }
> +
> +       skb_queue_walk(&mctp_pcc_ndev->inbox.packets, skb) {
> +               if (skb->data != buffer)
> +                       continue;
> +               skb_unlink(skb, &mctp_pcc_ndev->inbox.packets);
> +               dev_dstats_rx_add(mctp_pcc_ndev->ndev, skb->len);
> +               skb_reset_mac_header(skb);
> +               skb_pull(skb, sizeof(pcc_header));
> +               skb_reset_network_header(skb);
> +               cb = __mctp_cb(skb);
> +               cb->halen = 0;
> +               netif_rx(skb);
> +               return;
> +       }
> +       pr_warn("Unmatched packet in mctp-pcc inbox packet list");

Minor: would be helpful to include the device instance info here.
Perhaps netdev_warn()?

> +}
> +
> +static void mctp_pcc_tx_done(struct mbox_client *c, void *mssg, int r)
> +{
> +       struct mctp_pcc_mailbox *box;
> +       struct sk_buff *skb;
> +
> +       box = container_of(c, struct mctp_pcc_mailbox, client);
> +       skb_queue_walk(&box->packets, skb) {
> +               if (skb->data == mssg) {
> +                       skb_unlink(skb, &box->packets);
> +                       dev_consume_skb_any(skb);
> +                       break;
> +               }
> +       }

How are updates to this queue synchronised against concurrent start_xmit?

Similarly, for the RX path: what about drain_packets() vs. RX
alloc/completion - is there any guarantee that we have quiesced the rx
channel before mctp_cleanup_netdev() runs? It seems like this currently
depends on ordering of the devm cleanup actions.
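
To make the concern concrete, here is a minimal userspace sketch of one way
to serialise a submit path against a completion path.  It is an analogy
only: struct pending, pending_add() and pending_take() are hypothetical
names, and a C11 atomic_flag stands in for a kernel spinlock.  The point it
shows is that the search and the unlink happen under a single lock hold.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct pkt {
	struct pkt *next;
	void *data;
};

/* Pending list plus the lock that guards every access to it. */
struct pending {
	atomic_flag lock;
	struct pkt *head;
};

static void pending_lock(struct pending *q)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		;
}

static void pending_unlock(struct pending *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Submit path: push a packet at the head of the list. */
static void pending_add(struct pending *q, struct pkt *p)
{
	pending_lock(q);
	p->next = q->head;
	q->head = p;
	pending_unlock(q);
}

/* Completion path: find and unlink in one critical section, so a
 * concurrent pending_add() can never observe a half-updated list.
 */
static struct pkt *pending_take(struct pending *q, const void *data)
{
	struct pkt **pp, *p = NULL;

	assert(q);
	pending_lock(q);
	for (pp = &q->head; *pp; pp = &(*pp)->next) {
		if ((*pp)->data == data) {
			p = *pp;
			*pp = p->next;
			p->next = NULL;
			break;
		}
	}
	pending_unlock(q);
	return p;
}
```

In the driver the equivalent would presumably mean holding a lock across
the whole queue walk in both callbacks, rather than relying on the
per-operation locking of the individual queue helpers.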

> +}
> +
> +static netdev_tx_t mctp_pcc_tx(struct sk_buff *skb, struct net_device *ndev)
> +{
> +       struct mctp_pcc_ndev *mpnd = netdev_priv(ndev);
> +       struct pcc_header *pcc_header;
> +       int len = skb->len;
> +       int rc;
> +
> +       rc = skb_cow_head(skb, sizeof(*pcc_header));
> +       if (rc) {
> +               dev_dstats_tx_dropped(ndev);
> +               kfree_skb(skb);
> +               return NETDEV_TX_OK;
> +       }
> +
> +       pcc_header = skb_push(skb, sizeof(*pcc_header));
> +       pcc_header->signature = PCC_SIGNATURE | mpnd->outbox.index;
> +       pcc_header->flags = PCC_CMD_COMPLETION_NOTIFY;
> +       memcpy(&pcc_header->command, MCTP_SIGNATURE, MCTP_SIGNATURE_LENGTH);
> +       pcc_header->length = len + MCTP_SIGNATURE_LENGTH;
> +       skb_queue_head(&mpnd->outbox.packets, skb);
> +
> +       rc = mbox_send_message(mpnd->outbox.chan->mchan, skb->data);
> +
> +       if (rc < 0) {
> +               skb_unlink(skb, &mpnd->outbox.packets);
> +               return NETDEV_TX_BUSY;
> +       }

Is the lack of flow-control here an issue? Is there any facility for the
mailbox API to give you an indication of tx availability, or should you be
tracking that in the number of pending TX messages?

Or, in general: what are your expectations on retransmit for this
NETDEV_TX_BUSY case?
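
One possible shape for "tracking the number of pending TX messages", as a
userspace sketch under stated assumptions: TX_MAX_PENDING, struct tx_state,
tx_submit() and tx_complete() are all hypothetical, and the stopped flag
merely models what netif_stop_queue()/netif_wake_queue() would do in a real
driver.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_MAX_PENDING 4	/* hypothetical limit, not from the patch */

struct tx_state {
	unsigned int pending;	/* messages handed over, not yet completed */
	bool stopped;		/* models a stopped netdev TX queue */
};

/* Returns true if the message was accepted.  A false return models the
 * NETDEV_TX_BUSY case; with the queue stopped in time, the stack should
 * not normally reach it.
 */
static bool tx_submit(struct tx_state *s)
{
	assert(s);
	if (s->pending >= TX_MAX_PENDING) {
		s->stopped = true;
		return false;
	}
	s->pending++;
	if (s->pending == TX_MAX_PENDING)
		s->stopped = true;	/* stop before the next submit */
	return true;
}

/* Completion (tx_done) frees a slot and restarts the queue. */
static void tx_complete(struct tx_state *s)
{
	assert(s && s->pending > 0);
	s->pending--;
	s->stopped = false;
}
```

The design choice here is to stop the queue when the last slot is taken, so
that the busy return becomes an exceptional path instead of the normal
back-pressure signal.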


> +
> +       dev_dstats_tx_add(ndev, len);
> +       return NETDEV_TX_OK;
> +}
> +
> +static const struct net_device_ops mctp_pcc_netdev_ops = {
> +       .ndo_start_xmit = mctp_pcc_tx,
> +};
> +
> +static const struct mctp_netdev_ops mctp_netdev_ops = {
> +       NULL
> +};

Minor: mctp_register_netdev() allows passing NULL for the ops argument,
you don't need this empty ops struct.

> +static void drain_packets(struct sk_buff_head *list)
> +{
> +       struct sk_buff *skb;
> +
> +       while (!skb_queue_empty(list)) {
> +               skb = skb_dequeue(list);
> +               dev_consume_skb_any(skb);
> +       }
> +}
> +
> +static void mctp_cleanup_netdev(void *data)
> +{
> +       struct mctp_pcc_ndev *mctp_pcc_ndev;
> +       struct net_device *ndev = data;
> +
> +       mctp_pcc_ndev = netdev_priv(ndev);
> +       drain_packets(&mctp_pcc_ndev->outbox.packets);
> +       drain_packets(&mctp_pcc_ndev->inbox.packets);

You're draining these queues on device removal. I suspect you may want
to do this on ndo_stop instead.

This would prevent stale skbs being transferred over a change in network
namespace, which is probably a good thing, but may depend on details
of the PCC channel behaviour.
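
As a rough illustration of that suggestion, the sketch below models a device
whose stop hook, rather than its removal path, empties both queues. It is a
userspace model under stated assumptions: model_dev, pkt_queue, model_open
and model_stop are hypothetical names, and a real driver would be working
with sk_buff queues from its ndo_stop callback instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an skb and an skb queue; not kernel types. */
struct pkt { struct pkt *next; };
struct pkt_queue { struct pkt *head; size_t len; };

struct model_dev {
	bool up;
	struct pkt_queue outbox;
	struct pkt_queue inbox;
};

/* Queue one freshly allocated packet; returns 0 on success, -1 on failure. */
static int queue_push(struct pkt_queue *q)
{
	struct pkt *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->next = q->head;
	q->head = p;
	q->len++;
	return 0;
}

/* Free every packet still sitting on the queue. */
static void queue_drain(struct pkt_queue *q)
{
	while (q->head) {
		struct pkt *p = q->head;

		assert(q->len > 0);
		q->head = p->next;
		q->len--;
		free(p);
	}
}

static void model_open(struct model_dev *d)
{
	d->up = true;
}

/* Models an ndo_stop-style hook: nothing stale survives an interface down. */
static void model_stop(struct model_dev *d)
{
	d->up = false;
	queue_drain(&d->outbox);
	queue_drain(&d->inbox);
}
```

Draining at stop time means a later open, possibly in a different network
namespace, starts from empty queues.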

Cheers,


Jeremy

* Re: [PATCH v24 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-13  4:46   ` Jeremy Kerr
@ 2025-08-14 17:56     ` Adam Young
  2025-08-18  0:33       ` Jeremy Kerr
  2025-08-15 22:27     ` Adam Young
  1 sibling, 1 reply; 6+ messages in thread
From: Adam Young @ 2025-08-14 17:56 UTC (permalink / raw)
  To: Jeremy Kerr, admiyo, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li


On 8/13/25 00:46, Jeremy Kerr wrote:
> And just confirming: the pcc header format is now all host-endian - or
> is that a firmware-specified endianness? (what happens if the OS may
> boot in either BE or LE? is that at all possible for any PCC-capable
> hardware, or are we otherwise guaranteed that we're the same endianness
> as the consumer?)

The specification does not specify endianness. It is not addressed in
the ACPI PCC spec nor the MCTP over PCC spec. There do not seem to
be any endianness modifiers in any of the ACPI code base. Thus I have
removed any references to it. Since PCC is on a single SoC with a
shared memory setup, the endianness would have to be matched between both
ends of the system or you would have memory corruption.
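
To make the host-endian point concrete, here is a small userspace sketch
that compares a native store of a 32-bit header field with an explicit
little-endian encoding. The helper names (host_is_little_endian, put_le32,
native_matches_le32) are hypothetical and only illustrate the argument. The
sketch does not claim that PCC mandates little-endian, since the
specifications are silent on the matter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Report whether this host stores the low-order byte first. */
static bool host_is_little_endian(void)
{
	const uint32_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Encode v as four bytes, least significant byte first. */
static void put_le32(uint8_t out[4], uint32_t v)
{
	assert(out != NULL);
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}

/* True when a plain host-endian store of v equals its little-endian encoding. */
static bool native_matches_le32(uint32_t v)
{
	uint8_t native[4];
	uint8_t le[4];

	memcpy(native, &v, sizeof(native));
	put_le32(le, v);
	return memcmp(native, le, sizeof(le)) == 0;
}
```

If both sides of the shared memory region run with the same byte order, the
native store is all that is needed. The explicit encoding would only matter
if the two sides could disagree.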


* Re: [PATCH v24 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-13  4:46   ` Jeremy Kerr
  2025-08-14 17:56     ` Adam Young
@ 2025-08-15 22:27     ` Adam Young
  1 sibling, 0 replies; 6+ messages in thread
From: Adam Young @ 2025-08-15 22:27 UTC (permalink / raw)
  To: Jeremy Kerr, admiyo, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li


On 8/13/25 00:46, Jeremy Kerr wrote:
> I see that this is failing to build on net-next, as the mailbox changes
> are not present there. You may need to coordinate for the merge.


Does net-next get resynced when 6.17 advances? I can keep posting the
mailbox patch if that makes things progress.


* Re: [PATCH v24 1/1] mctp pcc: Implement MCTP over PCC Transport
  2025-08-14 17:56     ` Adam Young
@ 2025-08-18  0:33       ` Jeremy Kerr
  0 siblings, 0 replies; 6+ messages in thread
From: Jeremy Kerr @ 2025-08-18  0:33 UTC (permalink / raw)
  To: Adam Young, admiyo, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Sudeep Holla, Jonathan Cameron, Huisong Li

Hi Adam,

> > And just confirming: the pcc header format is now all host-endian - or
> > is that a firmware-specified endianness? (what happens if the OS may
> > boot in either BE or LE? is that at all possible for any PCC-capable
> > hardware, or are we otherwise guaranteed that we're the same endianness
> > as the consumer?)
> 
> The specification does not specify endianness. It is not addressed in
> the ACPI PCC spec nor the MCTP over PCC spec. There do not seem to
> be any endianness modifiers in any of the ACPI code base.

There are a few cases of endian conversion; they appear to be cases
where a pre-defined data format is used, and only where that data is BE.

That's a pretty strong indication that we don't need to worry about host
endian != firmware endian (I'm assuming that firmware is the entity
that parses the PCC headers you're constructing). All sounds good then.

Cheers,


Jeremy

Thread overview: 6+ messages
2025-08-11 15:38 [PATCH v24 0/1] MCTP Over PCC Transport admiyo
2025-08-11 15:38 ` [PATCH v24 1/1] mctp pcc: Implement MCTP over " admiyo
2025-08-13  4:46   ` Jeremy Kerr
2025-08-14 17:56     ` Adam Young
2025-08-18  0:33       ` Jeremy Kerr
2025-08-15 22:27     ` Adam Young