From: Aniruddha Rao <anrao@nvidia.com>
To: <thierry.reding@kernel.org>, <jonathanh@nvidia.com>
Cc: <linux-tegra@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
"Aniruddha Rao" <anrao@nvidia.com>
Subject: [PATCH 6/6] firmware: tegra: bpmp: Add Tegra410 MBWT sysfs interface
Date: Mon, 15 Jun 2026 08:37:31 +0000
Message-ID: <20260615083731.888055-7-anrao@nvidia.com>
In-Reply-To: <20260615083731.888055-1-anrao@nvidia.com>
Different workloads place different memory-bandwidth demands on the
system. The appropriate bandwidth limit depends on the runtime workload
mix and on the devices carrying that traffic, such as PCIe devices or a
GPU connected over a chip-to-chip link (Nv-Clink). That information is
not available to the kernel, so the limit has to be chosen in userspace.
Tegra410 provides Memory Bandwidth Throttler (MBWT) controls for PCIe
traffic and for GPU traffic over the chip-to-chip link (Nv-Clink) on the
path to DRAM. Each PCIe bandwidth group has a single shared cap for all
traffic in that group. A group may contain only PCIe devices, only a GPU
over Nv-Clink, or both PCIe and GPU traffic in a bifurcated topology.
Bandwidth for a group can be set per traffic type (PCIe read, PCIe
write, GPU over Nv-Clink).
Add a "bandwidth" sysfs attribute on the tegra-bpmp platform device as a
narrow userspace tuning knob for MBWT control.
A write to the attribute accepts a comma-separated tuple of
instance,vc_type,bandwidth. The instance field identifies the PCIe
bandwidth group. The vc_type field selects PCIe read, PCIe write, or GPU
over Nv-Clink traffic for that group, and bandwidth specifies the target
bandwidth limit to program. A read from the attribute queries firmware
for each documented instance and vc_type combination and returns one
tuple per line. Reads do not depend on a previous write. If firmware
returns an error for any GET_BW request, the read fails with that error.
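As a rough illustration of the write format described above, a userspace
tool might build and range-check a tuple before writing it. The sketch below
is not part of this patch. The helper names mbwt_format_tuple() and
mbwt_write_tuple() are hypothetical, the attribute path is supplied by the
caller, and the bounds are copied from the ABI document added by this patch
(instance 0-5, vc_type 0-2, bandwidth 1-110 GB/s). Firmware may still reject
a tuple that passes these checks.

```c
#include <stdio.h>

/* Bounds copied from the ABI text in this patch; firmware may still reject. */
#define MBWT_INSTANCE_MAX 5u
#define MBWT_VC_MAX       2u
#define MBWT_BW_MIN       1u
#define MBWT_BW_MAX       110u

/*
 * Hypothetical helper: format "instance,vc_type,bandwidth\n" into out.
 * Returns the string length on success, or -1 if a field is outside the
 * documented range or the buffer is too small.
 */
static int mbwt_format_tuple(char *out, size_t size, unsigned int instance,
			     unsigned int vc_type, unsigned int bandwidth)
{
	int n;

	if (instance > MBWT_INSTANCE_MAX || vc_type > MBWT_VC_MAX)
		return -1;
	if (bandwidth < MBWT_BW_MIN || bandwidth > MBWT_BW_MAX)
		return -1;

	n = snprintf(out, size, "%u,%u,%u\n", instance, vc_type, bandwidth);
	if (n < 0 || (size_t)n >= size)
		return -1;

	return n;
}

/*
 * Hypothetical helper: write one tuple to the attribute file at path.
 * Returns 0 on success, -1 on a range error or an I/O error.
 */
static int mbwt_write_tuple(const char *path, unsigned int instance,
			    unsigned int vc_type, unsigned int bandwidth)
{
	char buf[32];
	FILE *f;
	int ret = 0;

	if (mbwt_format_tuple(buf, sizeof(buf), instance, vc_type, bandwidth) < 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(buf, f) == EOF)
		ret = -1;
	/* A driver or firmware rejection may only surface when the data is flushed. */
	if (fclose(f) == EOF)
		ret = -1;

	return ret;
}
```

Checking the range in userspace only avoids a round trip for obviously
invalid input; the driver performs the same checks and remains the
authority.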
For example, writing 0,1,100 programs a 100 GB/s cap for PCIe write
traffic in PCIe bandwidth group 0. A subsequent read reports the
firmware-returned tuples, including the bandwidth value reported by
firmware for PCIe write traffic in group 0.
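To make the read side concrete, the following userspace sketch scans text
in the one-tuple-per-line format and looks up the bandwidth for a single
instance/vc_type pair. It is an illustration only and not part of this
patch. The function name mbwt_lookup() is hypothetical, and the caller is
assumed to have already read the attribute contents into a NUL-terminated
buffer.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan text holding one "instance,vc_type,bandwidth"
 * tuple per line and report the bandwidth for one instance/vc_type pair.
 * Returns 0 and fills *bandwidth on a match, or -1 if no line matches or a
 * malformed line is reached before a match.
 */
static int mbwt_lookup(const char *text, unsigned int instance,
		       unsigned int vc_type, unsigned int *bandwidth)
{
	const char *line = text;

	while (*line) {
		unsigned int i, v, bw;
		const char *end = strchr(line, '\n');

		if (sscanf(line, "%u,%u,%u", &i, &v, &bw) != 3)
			return -1;

		if (i == instance && v == vc_type) {
			*bandwidth = bw;
			return 0;
		}

		if (!end)
			break;
		line = end + 1;
	}

	return -1;
}
```

For instance, given the made-up buffer "0,0,100\n0,1,90\n", a lookup for
instance 0 and vc_type 1 would yield 90. Real values depend on what
firmware reports.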
Since Tegra410 is an ACPI-only platform, register the attribute only on
ACPI systems and only when BPMP firmware reports support for the MBWT
GET_BW and SET_BW requests through its query ABI.
Signed-off-by: Aniruddha Rao <anrao@nvidia.com>
---
.../ABI/testing/sys-platform-tegra-bpmp | 51 +++++
drivers/firmware/tegra/Makefile | 1 +
drivers/firmware/tegra/bpmp-private.h | 2 +
drivers/firmware/tegra/bpmp-tegra-sysfs.c | 210 ++++++++++++++++++
drivers/firmware/tegra/bpmp.c | 6 +
5 files changed, 270 insertions(+)
create mode 100644 Documentation/ABI/testing/sys-platform-tegra-bpmp
create mode 100644 drivers/firmware/tegra/bpmp-tegra-sysfs.c
diff --git a/Documentation/ABI/testing/sys-platform-tegra-bpmp b/Documentation/ABI/testing/sys-platform-tegra-bpmp
new file mode 100644
index 000000000000..2c08051ac39d
--- /dev/null
+++ b/Documentation/ABI/testing/sys-platform-tegra-bpmp
@@ -0,0 +1,51 @@
+What: /sys/bus/platform/devices/<bpmp-device>/bandwidth
+Date: June 2026
+KernelVersion: 7.1
+Contact: Aniruddha TVS Rao <anrao@nvidia.com>
+Description:
+ Provides access to the Tegra410 Memory Bandwidth Throttler
+ (MBWT) control exposed by BPMP firmware for PCIe traffic and
+ for GPU traffic over the chip-to-chip link (Nv-Clink) on the
+ path to DRAM.
+
+ The attribute is present only on ACPI-based Tegra410 systems
+ and only when BPMP firmware reports support for the MBWT
+ GET_BW and SET_BW requests through its query ABI.
+
+ Each PCIe bandwidth group has a single shared cap for all
+ traffic in that group. A group may contain only PCIe devices,
+ only a GPU over Nv-Clink, or both PCIe and GPU traffic in a
+ bifurcated topology. Bandwidth for a group can be set per
+ traffic type (PCIe Read, PCIe Write, GPU over Nv-Clink).
+
+ A write accepts a comma-separated tuple of
+ "instance,vc_type,bandwidth". Spaces around comma-separated
+ fields are allowed.
+
+ instance identifies the PCIe bandwidth group. Valid values are
+ 0-5, where 0 = pcie0, 1 = pcie1, ..., 5 = pcie5.
+
+ vc_type selects the traffic type for the selected group:
+ 0 = PCIe read
+ 1 = PCIe write
+ 2 = GPU over Nv-Clink
+
+ bandwidth specifies the target bandwidth cap in GB/s. Values
+ outside 1-110 (inclusive) are rejected by the driver before
+ it issues the SET_BW request. Firmware may still reject
+ values within that range.
+
+ A read queries firmware for each documented instance and
+ vc_type combination, and returns one tuple per line. Reads
+ do not depend on a previous write. If firmware returns an
+ error for any GET_BW request, the read fails with that error.
+
+ Example:
+ echo 0,1,100 > .../bandwidth
+ cat .../bandwidth
+ 0,0,100
+ 0,1,100
+ ...
+
+Users: Platform integration and bandwidth tuning on ACPI-based
+ Tegra410 systems (PCIe and GPU over Nv-Clink caps).
diff --git a/drivers/firmware/tegra/Makefile b/drivers/firmware/tegra/Makefile
index 4310cc0ff294..2fd060c66523 100644
--- a/drivers/firmware/tegra/Makefile
+++ b/drivers/firmware/tegra/Makefile
@@ -5,6 +5,7 @@ tegra-bpmp-$(CONFIG_ARCH_TEGRA_186_SOC) += bpmp-tegra186.o
tegra-bpmp-$(CONFIG_ARCH_TEGRA_194_SOC) += bpmp-tegra186.o
tegra-bpmp-$(CONFIG_ARCH_TEGRA_234_SOC) += bpmp-tegra186.o
tegra-bpmp-$(CONFIG_ARCH_TEGRA_410_SOC) += bpmp-tegra410.o
+tegra-bpmp-y += bpmp-tegra-sysfs.o
tegra-bpmp-$(CONFIG_ARCH_TEGRA_264_SOC) += bpmp-tegra186.o
tegra-bpmp-$(CONFIG_DEBUG_FS) += bpmp-debugfs.o
obj-$(CONFIG_TEGRA_BPMP) += tegra-bpmp.o
diff --git a/drivers/firmware/tegra/bpmp-private.h b/drivers/firmware/tegra/bpmp-private.h
index c3f466ae5979..ae26cc203aa4 100644
--- a/drivers/firmware/tegra/bpmp-private.h
+++ b/drivers/firmware/tegra/bpmp-private.h
@@ -66,4 +66,6 @@ static inline bool tegra410_bpmp_mbwt_cmd_is_supported(struct tegra_bpmp *bpmp,
}
#endif
+int tegra_bpmp_sysfs_register(struct tegra_bpmp *bpmp);
+
#endif
diff --git a/drivers/firmware/tegra/bpmp-tegra-sysfs.c b/drivers/firmware/tegra/bpmp-tegra-sysfs.c
new file mode 100644
index 000000000000..150c189ba8da
--- /dev/null
+++ b/drivers/firmware/tegra/bpmp-tegra-sysfs.c
@@ -0,0 +1,210 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2026, NVIDIA CORPORATION.
+ */
+
+#include <linux/acpi.h>
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/kstrtox.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/sysfs.h>
+
+#include <soc/tegra/bpmp.h>
+#include <soc/tegra/bpmp-abi.h>
+
+#include "bpmp-private.h"
+
+/* Documented sysfs / ABI bounds; firmware may still reject a request. */
+#define TEGRA_BPMP_MBWT_INSTANCE_MAX 5U
+#define TEGRA_BPMP_MBWT_VC_MAX 2U
+#define TEGRA_BPMP_MBWT_BW_MIN 1U
+#define TEGRA_BPMP_MBWT_BW_MAX 110U
+
+struct tegra_bpmp_mbwt_sysfs {
+ struct device_attribute dev_attr;
+ struct tegra_bpmp *bpmp;
+ /* Serializes bandwidth I/O. */
+ struct mutex lock;
+};
+
+static struct tegra_bpmp_mbwt_sysfs *
+tegra_bpmp_mbwt_sysfs_from_attr(struct device_attribute *attr)
+{
+ return container_of(attr, struct tegra_bpmp_mbwt_sysfs, dev_attr);
+}
+
+static int tegra_bpmp_mbwt_valid_tuple(unsigned int instance,
+ unsigned int vc_type,
+ unsigned int bandwidth)
+{
+ if (instance > TEGRA_BPMP_MBWT_INSTANCE_MAX)
+ return -EINVAL;
+ if (vc_type > TEGRA_BPMP_MBWT_VC_MAX)
+ return -EINVAL;
+ if (bandwidth < TEGRA_BPMP_MBWT_BW_MIN ||
+ bandwidth > TEGRA_BPMP_MBWT_BW_MAX)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int tegra_bpmp_mbwt_parse(const char *buf, size_t count,
+ unsigned int *instance,
+ unsigned int *vc_type,
+ unsigned int *bandwidth)
+{
+ unsigned int values[3];
+ char *copy, *cur, *tok;
+ unsigned int i = 0;
+ int err = 0;
+
+ copy = kmemdup_nul(buf, count, GFP_KERNEL);
+ if (!copy)
+ return -ENOMEM;
+
+ cur = strim(copy);
+ while ((tok = strsep(&cur, ",")) != NULL) {
+ if (i >= ARRAY_SIZE(values)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ tok = strim(tok);
+ if (!*tok) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ err = kstrtou32(tok, 0, &values[i]);
+ if (err)
+ goto out;
+
+ i++;
+ }
+
+ if (i != ARRAY_SIZE(values)) {
+ err = -EINVAL;
+ goto out;
+ }
+
+ *instance = values[0];
+ *vc_type = values[1];
+ *bandwidth = values[2];
+ err = 0;
+
+out:
+ kfree(copy);
+ return err;
+}
+
+static ssize_t bandwidth_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct tegra_bpmp_mbwt_sysfs *mbwt;
+ unsigned int instance, vc_type, bandwidth;
+ ssize_t len = 0;
+ int err;
+
+ mbwt = tegra_bpmp_mbwt_sysfs_from_attr(attr);
+
+ mutex_lock(&mbwt->lock);
+ for (instance = 0; instance <= TEGRA_BPMP_MBWT_INSTANCE_MAX; instance++) {
+ for (vc_type = 0; vc_type <= TEGRA_BPMP_MBWT_VC_MAX; vc_type++) {
+ err = tegra410_bpmp_mbwt_get(mbwt->bpmp, instance,
+ vc_type, &bandwidth);
+ if (err) {
+ mutex_unlock(&mbwt->lock);
+ return err;
+ }
+
+ len += sysfs_emit_at(buf, len, "%u,%u,%u\n", instance,
+ vc_type, bandwidth);
+ }
+ }
+ mutex_unlock(&mbwt->lock);
+
+ return len;
+}
+
+static ssize_t bandwidth_store(struct device *dev,
+ struct device_attribute *attr, const char *buf,
+ size_t count)
+{
+ struct tegra_bpmp_mbwt_sysfs *mbwt;
+ unsigned int instance, vc_type, bandwidth;
+ int err;
+
+ err = tegra_bpmp_mbwt_parse(buf, count, &instance, &vc_type,
+ &bandwidth);
+ if (err)
+ return err;
+
+ err = tegra_bpmp_mbwt_valid_tuple(instance, vc_type, bandwidth);
+ if (err)
+ return err;
+
+ mbwt = tegra_bpmp_mbwt_sysfs_from_attr(attr);
+
+ mutex_lock(&mbwt->lock);
+ err = tegra410_bpmp_mbwt_set(mbwt->bpmp, instance, vc_type, bandwidth);
+ mutex_unlock(&mbwt->lock);
+ if (err)
+ return err;
+
+ return count;
+}
+
+static void tegra_bpmp_mbwt_sysfs_teardown(void *data)
+{
+ struct tegra_bpmp_mbwt_sysfs *mbwt = data;
+
+ device_remove_file(mbwt->bpmp->dev, &mbwt->dev_attr);
+}
+
+int tegra_bpmp_sysfs_register(struct tegra_bpmp *bpmp)
+{
+ struct tegra_bpmp_mbwt_sysfs *mbwt;
+ int err;
+
+ if (!ACPI_HANDLE(bpmp->dev))
+ return 0;
+
+ if (!tegra_bpmp_mrq_is_supported(bpmp, MRQ_SOCHUB_MBWT))
+ return 0;
+
+ /*
+ * MRQ_QUERY_ABI only confirms that the MBWT MRQ is implemented. The
+ * firmware reports GET_BW / SET_BW support through the MBWT ABI query.
+ */
+ if (!tegra410_bpmp_mbwt_cmd_is_supported(bpmp, CMD_SOCHUB_MBWT_GET_BW) ||
+ !tegra410_bpmp_mbwt_cmd_is_supported(bpmp, CMD_SOCHUB_MBWT_SET_BW))
+ return 0;
+
+ mbwt = devm_kzalloc(bpmp->dev, sizeof(*mbwt), GFP_KERNEL);
+ if (!mbwt)
+ return -ENOMEM;
+
+ mbwt->bpmp = bpmp;
+ mutex_init(&mbwt->lock);
+
+ sysfs_attr_init(&mbwt->dev_attr.attr);
+ mbwt->dev_attr.attr.name = "bandwidth";
+ mbwt->dev_attr.attr.mode = 0644;
+ mbwt->dev_attr.show = bandwidth_show;
+ mbwt->dev_attr.store = bandwidth_store;
+
+ err = device_create_file(bpmp->dev, &mbwt->dev_attr);
+ if (err)
+ return err;
+
+ err = devm_add_action(bpmp->dev, tegra_bpmp_mbwt_sysfs_teardown, mbwt);
+ if (err) {
+ device_remove_file(bpmp->dev, &mbwt->dev_attr);
+ return err;
+ }
+
+ return 0;
+}
diff --git a/drivers/firmware/tegra/bpmp.c b/drivers/firmware/tegra/bpmp.c
index e9c0d6d3e24d..f637bdaa57cd 100644
--- a/drivers/firmware/tegra/bpmp.c
+++ b/drivers/firmware/tegra/bpmp.c
@@ -982,6 +982,12 @@ static int tegra_bpmp_probe(struct platform_device *pdev)
if (err < 0)
goto free_mrq;
+ err = tegra_bpmp_sysfs_register(bpmp);
+ if (err < 0)
+ dev_err(&pdev->dev,
+ "Failed registering sysfs attribute to the BPMP platform device: %d\n",
+ err);
+
err = tegra_bpmp_init_debugfs(bpmp);
if (err < 0)
dev_err(&pdev->dev, "debugfs initialization failed: %d\n", err);
--
2.43.0
Thread overview:
2026-06-15 8:37 [PATCH 0/6] firmware: tegra: bpmp: Add Tegra410 ACPI MBWT support Aniruddha Rao
2026-06-15 8:37 ` [PATCH 1/6] soc/tegra: Add Tegra410 SoC Kconfig symbol Aniruddha Rao
2026-06-15 8:37 ` [PATCH 2/6] firmware: tegra: bpmp: Move channel, resource init to helper Aniruddha Rao
2026-06-15 8:37 ` [PATCH 3/6] firmware: tegra: bpmp: Add ACPI support Aniruddha Rao
2026-06-15 8:37 ` [PATCH 4/6] firmware: tegra: bpmp: Add the Memory Bandwidth Throttler ABI definitions Aniruddha Rao
2026-06-15 8:37 ` [PATCH 5/6] firmware: tegra: bpmp: Add Tegra410 MBWT BPMP helpers Aniruddha Rao
2026-06-15 8:37 ` Aniruddha Rao [this message]