* Re: [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
From: Jason Gunthorpe @ 2026-06-04 18:20 UTC
To: Dimitri Daskalakis
Cc: Bjorn Helgaas, linux-pci, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Mahesh J Salgaonkar,
Oliver O'Halloran, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Alex Williamson, Kevin Tian,
Ankit Agrawal, Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
In-Reply-To: <20260604150153.3619662-1-dimitri.daskalakis1@gmail.com>
On Thu, Jun 04, 2026 at 08:01:41AM -0700, Dimitri Daskalakis wrote:
> With this patchset core enumerates the SIOV capability and can identify
> SIOV PFs. But there is no central mechanism to allocate/manage SIOV VFs.
> To support device pass through, devices will need to add a vfio-mdev
> driver with IOMMUFD support (or something similar).
There is an enormous amount of missing work to do something useful
with the SIOVr2 stuff. IIRC there are even supposed to be BIOS
components in this plan, and there are some missing PCI SIG topics
too.
So, I'm not sure how much value there is in merging just the cap
discovery without a roadmap for the missing parts..
Also, I'm quite surprised to see this out of the blue, there is an OCP
workstream that was building out a standard that outlines how all the
different components have to act to successfully implement it. What
is in PCI SIG was just some minor foundational adjustments without any
context on how to form them into a solution.
I think it is extremely premature to merge anything related to SIOV to
the kernel. Join the OCP work stream if you are interested. I think
the general feeling was there is not sufficient interest in the
industry to do this and it has gone quiet.
Jason
* Re: [PATCH v5 05/20] dma-pool: track decrypted atomic pools and select them via attrs
From: Jason Gunthorpe @ 2026-06-04 18:24 UTC
To: Aneesh Kumar K.V
Cc: Michael Kelley, iommu@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-coco@lists.linux.dev,
Robin Murphy, Marek Szyprowski, Will Deacon, Marc Zyngier,
Steven Price, Suzuki K Poulose, Catalin Marinas, Jiri Pirko,
Mostafa Saleh, Petr Tesarik, Alexey Kardashevskiy, Dan Williams,
Xu Yilun, linuxppc-dev@lists.ozlabs.org,
linux-s390@vger.kernel.org, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy (CS GROUP), Alexander Gordeev,
Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Christian Borntraeger, Sven Schnelle, x86@kernel.org, Jiri Pirko
In-Reply-To: <yq5apl26qrof.fsf@kernel.org>
On Thu, Jun 04, 2026 at 08:27:36PM +0530, Aneesh Kumar K.V wrote:
> I already sent a v6 in the hope of getting this merged for the next
> merge window. Should I send a v7, or would you prefer that I do the
> rename on top of v6?
I think it is too late for such a major change, but this should be
imagined to be for rc2ish next cycle. You also have to spell out how
the pkvm patch will get sequenced as well, it would be best to push
that it gets picked up right away.
Jason
* Re: [PATCH v3 0/6] Remove remaining references to the pktcdvd driver
From: Catalin Iacob @ 2026-06-04 20:59 UTC
To: Bart Van Assche
Cc: Thomas Bogendoerfer, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy (CS GROUP), Rich Felker,
John Paul Adrian Glaubitz, David S. Miller, Andreas Larsson,
James E.J. Bottomley, Martin K. Petersen, Jens Axboe,
Yoshinori Sato, linux-mips, linux-kernel, linuxppc-dev, linux-sh,
sparclinux, linux-scsi
In-Reply-To: <96a48e65-6ee2-4b81-91f8-821a2de707db@acm.org>
On Thu, Jun 04, 2026 at 08:50:22AM -0700, Bart Van Assche wrote:
> If this series is reposted, please combine patches 1/6 and 2/6. Anyway,
> this series looks good to me.
Thanks for taking a look.
I will repost in a few days when back to a computer to fix the subject
of patch 2 since you are right, it does wrongly suggest that the
implementation is moved.
For learning purposes, do you mind quickly sharing your rationale for
combining 1 and 2? To me they are in the same area but still separate
things: one removes dead code, the other increases encapsulation. Each
is a small but worthwhile cleanup, they can be applied in any order and
each can be applied without the other. It's true that, while patch 2
without patch 1 works, it would increase confusion but I find that an
argument to order 2 after 1, not to say they are one thing.
I'd also find it hard to describe what that combined patch does except
by calling out two things: "remove dead code and move the declaration".
Your suggested "unexport" seems more confusing than clarifying, I'd
argue that today the symbol is not exported since that ifdef always
compiles the export out.
* Re: [PATCH v3 0/6] Remove remaining references to the pktcdvd driver
From: Bart Van Assche @ 2026-06-04 21:17 UTC
To: Catalin Iacob
Cc: Thomas Bogendoerfer, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy (CS GROUP), Rich Felker,
John Paul Adrian Glaubitz, David S. Miller, Andreas Larsson,
James E.J. Bottomley, Martin K. Petersen, Jens Axboe,
Yoshinori Sato, linux-mips, linux-kernel, linuxppc-dev, linux-sh,
sparclinux, linux-scsi
In-Reply-To: <aiHbj0zn4y9X6qV2@toolbx>
On 6/4/26 1:59 PM, Catalin Iacob wrote:
> For learning purposes, do you mind quickly sharing your rationale for
> combining 1 and 2? To me they are in the same area but still separate
> things: one removes dead code, the other increases encapsulation. Each
> is a small but worthwhile cleanup, they can be applied in any order and
> each can be applied without the other. It's true that, while patch 2
> without patch 1 works, it would increase confusion but I find that an
> argument to order 2 after 1, not to say they are one thing.
>
> I'd also find it hard to describe what that combined patch does except
> by calling out two things: "remove dead code and move the declaration".
> Your suggested "unexport" seems more confusing than clarifying, I'd
> argue that today the symbol is not exported since that ifdef always
> compiles the export out.
"Unexport" is commonly used terminology in the Linux kernel for removing
EXPORT_SYMBOL*().
I proposed to combine patches 1 and 2 because it is normal that the
declaration of a function is moved from a public to a private header
file if it is unexported.
If patches 1 and 2 are combined, the most important change is removal
of EXPORT_SYMBOL_GPL(). So I think that "Unexport
scsi_device_from_queue()" would still be an appropriate description for
such a patch.
Thanks,
Bart.
* Re: [PATCH] perf data convert json: Fix trace_seq memory leak in process_sample_event()
From: Arnaldo Carvalho de Melo @ 2026-06-04 21:29 UTC
To: Tanushree Shah
Cc: jolsa, adrian.hunter, vmolnaro, mpetlan, tmricht, maddy, irogers,
namhyung, linux-perf-users, linuxppc-dev, atrajeev, hbathini,
Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor
In-Reply-To: <20260604065519.112101-1-tshah@linux.ibm.com>
On Thu, Jun 04, 2026 at 12:25:19PM +0530, Tanushree Shah wrote:
> Unlike the in-kernel trace_seq which uses a statically allocated buffer,
> the userspace traceevent library's trace_seq uses a dynamically allocated
> one. Therefore, every trace_seq_init() call must be paired with a
> trace_seq_destroy(), otherwise it produces a memory leak.
>
> In process_sample_event(), a trace_seq is initialized for each field when
> formatting tracepoint raw_data, but the matching trace_seq_destroy() is
> never called, leaking memory for every field of every sample processed.
>
> Add the missing trace_seq_destroy() after using the trace_seq buffer to
> properly free the allocated memory.
>
> Detected with Valgrind on a perf.data file with 2,729 tracepoint samples:
> Before: definitely lost: 55,537,664 bytes in 13,559 blocks
> After: definitely lost: 0 bytes in 0 blocks
Thanks, applied to perf-tools-next, for v7.2.
- Arnaldo
> Fixes: 9d895e468429 ("perf data: Add tracepoint fields when converting to JSON")
> Signed-off-by: Tanushree Shah <tshah@linux.ibm.com>
> ---
> tools/perf/util/data-convert-json.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tools/perf/util/data-convert-json.c b/tools/perf/util/data-convert-json.c
> index d526c91312ed..6dd9349ab10f 100644
> --- a/tools/perf/util/data-convert-json.c
> +++ b/tools/perf/util/data-convert-json.c
> @@ -257,6 +257,7 @@ static int process_sample_event(const struct perf_tool *tool,
> trace_seq_init(&s);
> tep_print_field(&s, sample->raw_data, fields[i]);
> output_json_key_string(out, true, 3, fields[i]->name, s.buffer);
> + trace_seq_destroy(&s);
>
> i++;
> }
> --
> 2.47.3
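The init/destroy pairing described in the commit message can be sketched with a stand-in type. Everything named `demo_*` below is hypothetical and only mimics a sequence buffer that owns heap memory; it is not the libtraceevent API, and the 64-byte size is an arbitrary assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a sequence buffer that owns heap memory. */
struct demo_seq {
	char *buffer;
	size_t len;
};

static int live_buffers; /* buffers allocated but not yet freed */

static int demo_seq_init(struct demo_seq *s)
{
	s->buffer = calloc(1, 64); /* arbitrary demo size */
	if (!s->buffer)
		return -1;
	s->len = 64;
	live_buffers++;
	return 0;
}

static void demo_seq_destroy(struct demo_seq *s)
{
	free(s->buffer);
	s->buffer = NULL;
	s->len = 0;
	live_buffers--;
}

/*
 * Format nr_fields fields with one buffer per field, as the loop in the
 * patch does. Returns how many buffers are still allocated afterwards.
 */
static int demo_format_fields(int nr_fields, int destroy_each)
{
	int i;

	live_buffers = 0;
	for (i = 0; i < nr_fields; i++) {
		struct demo_seq s;

		if (demo_seq_init(&s))
			break;
		strcpy(s.buffer, "value");
		if (destroy_each)
			demo_seq_destroy(&s); /* the call the fix adds */
	}
	return live_buffers;
}
```

With `destroy_each` set, `demo_format_fields()` ends with no live buffers. Passing 0 deliberately skips the destroy call, so one buffer per field stays allocated, which is the pattern Valgrind reported.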
* Re: [PATCH V4 1/2] tools/perf: Fix the check for parameterized field in event term
From: Namhyung Kim @ 2026-06-04 21:52 UTC
To: Athira Rajeev
Cc: acme, jolsa, adrian.hunter, mpetlan, tmricht, maddy, irogers,
linux-perf-users, linuxppc-dev, hbathini, Tejas.Manhas1,
Tanushree.Shah, shivani
In-Reply-To: <20260504154205.21394-1-atrajeev@linux.ibm.com>
On Mon, May 04, 2026 at 09:12:04PM +0530, Athira Rajeev wrote:
> The format_alias() function in util/pmu.c has a check to
> detect whether the event has parameterized field ( =? ).
> The string alias->terms contains the event terms; if the event
> has a user-configurable parameter, alias->terms contains the
> substring "=?".
>
> Snippet of code:
>
> /* Paramemterized events have the parameters shown. */
> if (strstr(alias->terms, "=?")) {
> /* No parameters. */
> snprintf(buf, len, "%.*s/%s/", (int)pmu_name_len, pmu->name, alias->name);
>
> If strstr() finds the substring, it returns a non-NULL pointer,
> so the "No parameters" branch is taken for parameterized events,
> which is the opposite of what is intended. As a result, "perf list"
> does not show the parameterized fields.
>
> Fix this check to use:
>
> if (!strstr(alias->terms, "=?")) {
>
> With this change, perf list shows the events correctly with
> the strings showing parameters.
>
> Before the fix:
>
> # ./perf list|grep -w PM_PAU_CYC
> hv_24x7/PM_PAU_CYC/ [Kernel PMU event]
>
> With this fix:
>
> # ./perf list|grep -w PM_PAU_CYC
> hv_24x7/PM_PAU_CYC,chip=?/ [Kernel PMU event]
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Thanks,
Namhyung
> ---
> Changelog:
> v3 -> v4:
> Updated commit message to show real example
> addressing review comment from Namhyung.
>
> v2 -> v3:
> Split the strstr correction in a single patch
>
> tools/perf/util/pmu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 23337d2fa281..0b8d58543f17 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -2117,7 +2117,7 @@ static char *format_alias(char *buf, int len, const struct perf_pmu *pmu,
> skip_duplicate_pmus);
>
> /* Paramemterized events have the parameters shown. */
> - if (strstr(alias->terms, "=?")) {
> + if (!strstr(alias->terms, "=?")) {
> /* No parameters. */
> snprintf(buf, len, "%.*s/%s/", (int)pmu_name_len, pmu->name, alias->name);
> return buf;
> --
> 2.47.3
>
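The inverted check is easy to see in isolation. The sketch below is standalone and not taken from perf; `demo_has_param()` is a hypothetical helper and the terms strings in the usage note are made-up examples.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * strstr() returns a pointer to the first match, or NULL when the
 * substring is absent. A terms string is parameterized when it
 * contains "=?", so the "no parameters" case is !strstr(...).
 */
static int demo_has_param(const char *terms)
{
	return strstr(terms, "=?") != NULL;
}

/* Mirrors the corrected condition: true only when nothing is user-settable. */
static int demo_takes_no_param_branch(const char *terms)
{
	return !strstr(terms, "=?");
}
```

For a string such as `"domain=1,chip=?"`, `demo_has_param()` is true and the short "no parameters" form is skipped; for `"event=0x10"` the short form is used.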
* Re: [PATCH V4 2/2] tools/perf: Use scnprintf in buffer offset calculations
From: Namhyung Kim @ 2026-06-04 22:02 UTC
To: Athira Rajeev
Cc: acme, jolsa, adrian.hunter, mpetlan, tmricht, maddy, irogers,
linux-perf-users, linuxppc-dev, hbathini, Tejas.Manhas1,
Tanushree.Shah, shivani
In-Reply-To: <20260504154205.21394-2-atrajeev@linux.ibm.com>
On Mon, May 04, 2026 at 09:12:05PM +0530, Athira Rajeev wrote:
> Replace snprintf with scnprintf in buffer offset calculations to
> ensure the 'used' count will not exceed the "len".
>
> The current logic in perf_pmu__for_each_event uses an unconditional
> + 1 increment to buf_used to account for null terminators. This can
> cause a stack buffer overflow in the subsequent scnprintf call.
> When the local stack buffer buf (1024 bytes) is full, buf_used can
> reach 1025. This causes the subsequent remaining space calculation
> sizeof(buf) - buf_used to underflow.
>
> Use sub_non_neg() to see if space actually existed, and only
> increment the offset if remaining space is present.
>
> Changes include:
> - Use sub_non_neg to check if space exists
> - Replacing snprintf with scnprintf to ensure the return value
> reflects the actual bytes written into the buffer.
> - Only increment buf_used by 1 if space exists
> - If a parameterized event uses a built-in perf keyword for its
> parameter name (eg, config=?), the lexer parses it as a predefined
> term token, which sets term->config to NULL. Add check to use
> parse_events__term_type_str() if term->config is NULL.
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
> ---
> Changelog:
> v2 -> v3:
> - Split the scnprintf related changes in separate patch
> - Handle the overflow issues and unconditional increment
> wrapped around sub_non_neg addressing review comment from Sashiko
>
> tools/perf/util/pmu.c | 46 ++++++++++++++++++++++++++++++++-----------
> 1 file changed, 35 insertions(+), 11 deletions(-)
>
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 0b8d58543f17..4b9ade1a4cf9 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -2129,15 +2129,19 @@ static char *format_alias(char *buf, int len, const struct perf_pmu *pmu,
> pr_err("Failure to parse '%s' terms '%s': %d\n",
> alias->name, alias->terms, ret);
> parse_events_terms__exit(&terms);
> - snprintf(buf, len, "%.*s/%s/", (int)pmu_name_len, pmu->name, alias->name);
> + scnprintf(buf, len, "%.*s/%s/", (int)pmu_name_len, pmu->name, alias->name);
> return buf;
> }
> - used = snprintf(buf, len, "%.*s/%s", (int)pmu_name_len, pmu->name, alias->name);
> + used = scnprintf(buf, len, "%.*s/%s", (int)pmu_name_len, pmu->name, alias->name);
>
> list_for_each_entry(term, &terms.terms, list) {
> + const char *name = term->config;
> +
> + if (!name)
> + name = parse_events__term_type_str(term->type_term);
> if (term->type_val == PARSE_EVENTS__TERM_TYPE_STR)
> - used += snprintf(buf + used, sub_non_neg(len, used),
> - ",%s=%s", term->config,
> + used += scnprintf(buf + used, sub_non_neg(len, used),
> + ",%s=%s", name,
> term->val.str);
> }
> parse_events_terms__exit(&terms);
> @@ -2201,6 +2205,7 @@ int perf_pmu__for_each_event(struct perf_pmu *pmu, bool skip_duplicate_pmus,
> int ret = 0;
> struct hashmap_entry *entry;
> size_t bkt;
> + size_t size_rem, len;
>
> if (perf_pmu__is_tracepoint(pmu))
> return tp_pmu__for_each_event(pmu, state, cb);
> @@ -2234,17 +2239,36 @@ int perf_pmu__for_each_event(struct perf_pmu *pmu, bool skip_duplicate_pmus,
> }
> buf_used = strlen(buf) + 1;
> }
> +
> info.scale_unit = NULL;
> if (strlen(event->unit) || event->scale != 1.0) {
> - info.scale_unit = buf + buf_used;
> - buf_used += snprintf(buf + buf_used, sizeof(buf) - buf_used,
> - "%G%s", event->scale, event->unit) + 1;
> + /* Check the remaining space */
> + size_rem = sub_non_neg(sizeof(buf), buf_used);
> +
> + if (size_rem > 0) {
> + info.scale_unit = buf + buf_used;
> + len = scnprintf(buf + buf_used, size_rem, "%G%s",
> + event->scale, event->unit);
> + /*
> + * Increment buf_used by 1 only if
> + * it fits remaining space
> + */
> + buf_used += min(len + 1, size_rem);
Hmm.. it seems scnprintf() cannot return a number greater than or equal
to size_rem. Can we just do like this?
buf_used += scnprintf(...) + 1;
Thanks,
Namhyung
> + }
> }
> info.desc = event->desc;
> info.long_desc = event->long_desc;
> - info.encoding_desc = buf + buf_used;
> - buf_used += snprintf(buf + buf_used, sizeof(buf) - buf_used,
> - "%.*s/%s/", (int)pmu_name_len, info.pmu_name, event->terms) + 1;
> + info.encoding_desc = NULL;
> +
> + /* Check the remaining space */
> + size_rem = sub_non_neg(sizeof(buf), buf_used);
> + if (size_rem > 0) {
> + info.encoding_desc = buf + buf_used;
> + len = scnprintf(buf + buf_used, size_rem, "%.*s/%s/",
> + (int)pmu_name_len, info.pmu_name, event->terms);
> + buf_used += min(len + 1, size_rem);
> + }
> +
> info.str = event->terms;
> info.topic = event->topic;
> info.deprecated = perf_pmu_alias__check_deprecated(pmu, event);
> @@ -2254,7 +2278,7 @@ int perf_pmu__for_each_event(struct perf_pmu *pmu, bool skip_duplicate_pmus,
> }
> if (pmu->selectable) {
> info.name = buf;
> - snprintf(buf, sizeof(buf), "%s//", pmu->name);
> + scnprintf(buf, sizeof(buf), "%s//", pmu->name);
> info.alias = NULL;
> info.scale_unit = NULL;
> info.desc = NULL;
> --
> 2.47.3
>
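The review comment relies on the difference between the two return values. The following is a user-space sketch of scnprintf-style semantics built on `vsnprintf()`; `demo_scnprintf()` is a hypothetical name used only for illustration, not the helper perf links against.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * snprintf() returns the length the output would have had without
 * truncation. This wrapper returns the number of characters actually
 * stored, excluding the terminating NUL, so the result is at most
 * size - 1 (and 0 when size is 0).
 */
static int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;
	return (size_t)n >= size ? (int)(size - 1) : n;
}
```

Because the return value never reaches `size`, `ret + 1 <= size` holds whenever `size > 0`. Under that assumption, adding `ret + 1` to the offset cannot move it past the end of the buffer, which is the reasoning behind dropping the `min()` clamp.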
* Re: [PATCH 11/23] i2c: pxa-pci: use platform_device_set_of_node()
From: Andi Shyti @ 2026-06-04 23:51 UTC
To: Bartosz Golaszewski
Cc: Lee Jones, Mark Brown, Thierry Reding, Sebastian Hesselbarth,
Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Srinivas Kandagatla, Greg Kroah-Hartman, Vinod Koul,
Rafael J. Wysocki, Danilo Krummrich, Rob Herring, Saravana Kannan,
Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy (CS GROUP), Andy Shevchenko, Joerg Roedel,
Will Deacon, Robin Murphy, Doug Berger, Florian Fainelli,
Broadcom internal kernel review list, Ulf Hansson, Frank Li,
Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam,
Matthew Brost, Thomas Hellström, Rodrigo Vivi, David Airlie,
Simona Vetter, Peter Chen, Paul Cercueil, Bin Liu, Philipp Zabel,
Maximilian Luz, Hans de Goede, Ilpo Järvinen,
Krzysztof Kozlowski, Benjamin Herrenschmidt, brgl, linux-kernel,
netdev, linux-arm-msm, linux-sound, driver-core, devicetree,
linuxppc-dev, linux-i2c, iommu, linux-pm, imx, linux-arm-kernel,
intel-xe, dri-devel, linux-usb, linux-mips, platform-driver-x86
In-Reply-To: <20260521-pdev-fwnode-ref-v1-11-88c324a1b8d2@oss.qualcomm.com>
Hi Bartosz,
On Thu, May 21, 2026 at 10:36:34AM +0200, Bartosz Golaszewski wrote:
> Ahead of reworking the reference counting logic for platform devices,
> encapsulate the assignment of the OF node for dynamically allocated
> platform devices with the provided helper.
>
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Acked-by: Andi Shyti <andi.shyti@kernel.org>
Thanks,
Andi
* Re: [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
From: Jason Gunthorpe @ 2026-06-04 23:53 UTC
To: Dimitri Daskalakis
Cc: Bjorn Helgaas, linux-pci, Madhavan Srinivasan, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Mahesh J Salgaonkar,
Oliver O'Halloran, Niklas Schnelle, Gerald Schaefer,
Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Alex Williamson, Kevin Tian,
Ankit Agrawal, Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
In-Reply-To: <a07763e2-aff3-4468-83c2-d50a9e1605c8@gmail.com>
On Thu, Jun 04, 2026 at 04:49:05PM -0700, Dimitri Daskalakis wrote:
>
> On 6/4/26 11:20 AM, Jason Gunthorpe wrote:
> > On Thu, Jun 04, 2026 at 08:01:41AM -0700, Dimitri Daskalakis wrote:
> >> With this patchset core enumerates the SIOV capability and can identify
> >> SIOV PFs. But there is no central mechanism to allocate/manage SIOV VFs.
> >> To support device pass through, devices will need to add a vfio-mdev
> >> driver with IOMMUFD support (or something similar).
> >
> > There is an enormous amount of missing work to do something useful
> > with the SIOVr2 stuff. IIRC there are even supposed to be BIOS
> > components in this plan, and there are some missing PCI SIG topics
> > too.
> >
> > So, I'm not sure how much value there is in merging just the cap
> > discovery without a roadmap for the missing parts..
> >
> > Also, I'm quite surprised to see this out of the blue, there is an OCP
> > workstream that was building out a standard that outlines how all the
> > different components have to act to successfully implement it. What
> > is in PCI SIG was just some minor foundational adjustments without any
> > context on how to form them into a solution.
> >
> > I think it is extremely premature to merge anything related to SIOV to
> > the kernel. Join the OCP work stream if you are interested. I think
> > the general feeling was there is not sufficient interest in the
> > industry to do this and it has gone quiet.
> >
> > Jason
>
> Hey Jason, thanks for the feedback. We (at Meta) are definitely
> interested in SIOV-like capabilities for device passthrough to containers.
>
> For those scenarios, having PCIe transactions per RID plus IOMMU
> isolation is enough, but I can imagine hypervisors/VMs requiring more
> platform support.
>
> I hear you on the broader support story being premature. But on the
> other hand, this series unblocks experimentation at the driver level
> for basic data path validation.
And you can do your experimentation OOT :)
Go to the OCP workstream and finish the standard if you want to get
something merged.
There is zero need for SIOV if all you want is containers. containers
do not require iommu isolation or unique RIDs.
Jason
* Re: [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
From: Jakub Kicinski @ 2026-06-05 0:59 UTC
To: Dimitri Daskalakis
Cc: Jason Gunthorpe, Bjorn Helgaas, linux-pci, Madhavan Srinivasan,
Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Mahesh J Salgaonkar, Oliver O'Halloran, Niklas Schnelle,
Gerald Schaefer, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
Christian Borntraeger, Sven Schnelle, Alex Williamson, Kevin Tian,
Ankit Agrawal, Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm, xen-devel
In-Reply-To: <a07763e2-aff3-4468-83c2-d50a9e1605c8@gmail.com>
On Thu, 4 Jun 2026 16:49:05 -0700 Dimitri Daskalakis wrote:
> Hey Jason, thanks for the feedback. We (at Meta) are definitely
> interested in SIOV-like capabilities for device passthrough to containers.
I think "definitely interested" is quite an exaggeration.
More importantly, which big company is interested in what
tech should be irrelevant upstream. Please don't make these
sort of arguments. The code has to stand on its own.
* [PATCH v3 net-next 0/9] net: dsa: netc: add bridge mode support
From: wei.fang @ 2026-06-05 1:47 UTC
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
From: Wei Fang <wei.fang@nxp.com>
This series adds bridge mode support to the NETC DSA switch driver,
covering both VLAN-aware and VLAN-unaware operation.
The NETC switch manages forwarding through a set of hardware tables
accessed via NTMP: the FDB table (FDBT), VLAN filter table (VFT), egress
treatment table (ETT), and egress count table (ECT). The series extends
the NTMP layer with the operations required for bridging, then builds the
DSA bridge callbacks on top.
Since all switch ports share the VFT, only one VLAN-aware bridge is
supported.
FDB aging is managed in software. A periodic delayed work sweeps the
table using the hardware activity element mechanism, with a default aging
time of 300 seconds matching the IEEE 802.1Q standard. Per-port entries
are also flushed immediately on bridge leave and link-down events.
---
v3:
1. Update the kernel doc of ntmp_fdbt_update_activity_element()
2. Remove struct ect_stse_data from ntmp.h
3. Refactor netc_vlan_unaware_pvid()
4. Refine netc_port_remove_host_flood()
5. Add bridge.num check in netc_port_bridge_join() and add an error path
6. Remove the parameter ett_eid from netc_init_ett_cfge()
7. Move the implementation of .port_fast_age() from patch 9 to patch 8
8. Refine netc_port_update_vlan_egress_rule(), reset ECT entries after
updating ETT entries
9. Update commit messages of patch 8 and 9, and improve some comments
v2 link: https://lore.kernel.org/imx/20260602072313.3162120-1-wei.fang@oss.nxp.com/
v1 link: https://lore.kernel.org/imx/20260527100217.794987-1-wei.fang@oss.nxp.com/
---
Wei Fang (9):
net: enetc: add interfaces to manage dynamic FDB entries
net: enetc: add "Update" and "Delete" operations to VLAN filter table
net: enetc: add interfaces to manage egress treatment table
net: enetc: add "Update" operation to the egress count table
net: dsa: netc: initialize the group bitmap of ETT and ECT
net: enetc: add helpers to set/clear table bitmap
net: dsa: netc: add VLAN filter table and egress treatment management
net: dsa: netc: add bridge mode support
net: dsa: netc: implement dynamic FDB entry ageing
drivers/net/dsa/netc/netc_main.c | 940 +++++++++++++++++-
drivers/net/dsa/netc/netc_switch.h | 33 +
drivers/net/dsa/netc/netc_switch_hw.h | 6 +
drivers/net/ethernet/freescale/enetc/ntmp.c | 440 +++++++-
.../ethernet/freescale/enetc/ntmp_private.h | 18 +-
include/linux/fsl/ntmp.h | 55 +
6 files changed, 1470 insertions(+), 22 deletions(-)
--
2.34.1
* [PATCH v3 net-next 1/9] net: enetc: add interfaces to manage dynamic FDB entries
From: wei.fang @ 2026-06-05 1:48 UTC
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
Add three interfaces to manage dynamic entries in the FDB table:
ntmp_fdbt_update_activity_element(): Update the activity element of all
dynamic FDB entries. For each entry, if its activity flag is not set,
which means no packet has matched this entry since the last update, the
activity counter is incremented. Otherwise, both the activity flag and
activity counter are reset. The activity counter is used to track how
long an FDB entry has been inactive, which is useful for implementing
an ageing mechanism.
ntmp_fdbt_delete_ageing_entries(): Delete all dynamic FDB entries whose
activity flag is not set and whose activity counter is greater than or
equal to the specified threshold. This is used to remove stale entries
that have been inactive for too long.
ntmp_fdbt_delete_port_dynamic_entries(): Delete all dynamic FDB entries
associated with the specified switch port. This is typically called when
a port goes down or is removed from a bridge.
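The semantics of the three operations can be modelled in plain C on an in-memory table. This is only a sketch of the behaviour described above, not the NTMP command path: the `demo_*` names, the single `port` field (the hardware uses a port bitmap), and the saturation value `DEMO_ACT_CNT_MAX` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ACT_CNT_MAX 127 /* assumed saturation point, not from the patch */

struct demo_fdb_entry {
	bool valid;
	bool dynamic;
	bool act_flag;          /* set when a packet matched the entry */
	unsigned char act_cnt;  /* update rounds without a match */
	int port;
};

/* Activity update: idle dynamic entries age, active ones are reset. */
static void demo_update_activity(struct demo_fdb_entry *tbl, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct demo_fdb_entry *e = &tbl[i];

		if (!e->valid || !e->dynamic)
			continue;
		if (e->act_flag) {
			e->act_flag = false;
			e->act_cnt = 0;
		} else if (e->act_cnt < DEMO_ACT_CNT_MAX) {
			e->act_cnt++;
		}
	}
}

/* Delete dynamic entries with the flag clear and act_cnt >= threshold. */
static int demo_delete_aged(struct demo_fdb_entry *tbl, size_t n,
			    unsigned char threshold)
{
	int deleted = 0;

	for (size_t i = 0; i < n; i++) {
		struct demo_fdb_entry *e = &tbl[i];

		if (e->valid && e->dynamic && !e->act_flag &&
		    e->act_cnt >= threshold) {
			e->valid = false;
			deleted++;
		}
	}
	return deleted;
}

/* Delete every dynamic entry learned on the given port. */
static int demo_delete_port(struct demo_fdb_entry *tbl, size_t n, int port)
{
	int deleted = 0;

	for (size_t i = 0; i < n; i++) {
		struct demo_fdb_entry *e = &tbl[i];

		if (e->valid && e->dynamic && e->port == port) {
			e->valid = false;
			deleted++;
		}
	}
	return deleted;
}
```

In this model, a periodic sweep would call `demo_update_activity()` followed by `demo_delete_aged()`, so the ageing time is roughly the sweep interval multiplied by the threshold. Static entries are never touched by any of the three helpers.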
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/enetc/ntmp.c | 162 ++++++++++++++++++
.../ethernet/freescale/enetc/ntmp_private.h | 4 +-
include/linux/fsl/ntmp.h | 3 +
3 files changed, 167 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/ntmp.c b/drivers/net/ethernet/freescale/enetc/ntmp.c
index f71cad943424..bda26fe93b8d 100644
--- a/drivers/net/ethernet/freescale/enetc/ntmp.c
+++ b/drivers/net/ethernet/freescale/enetc/ntmp.c
@@ -31,6 +31,7 @@
#define NTMP_GEN_UA_STSEU BIT(1)
/* Specific Update Actions for some tables */
+#define FDBT_UA_ACTEU BIT(1)
#define BPT_UA_BPSEU BIT(1)
/* Query Action: 0: Full query. 1: Query entry ID, the fields after entry
@@ -793,6 +794,167 @@ int ntmp_fdbt_search_port_entry(struct ntmp_user *user, int port,
}
EXPORT_SYMBOL_GPL(ntmp_fdbt_search_port_entry);
+/**
+ * ntmp_fdbt_update_activity_element - update the activity element of all
+ * the dynamic entries in the FDB table.
+ * @user: target ntmp_user struct
+ *
+ * A single activity update management could be used to process all the
+ * dynamic entries in the FDB table. When hardware process an activity
+ * update management command for an entry in the FDB table and the entry
+ * does not have its activity flag set, the activity counter is incremented.
+ * However, if the activity flag is set, then both the activity flag and
+ * activity counter are reset. Software can issue the activity update
+ * management commands at predefined times and the value of the activity
+ * counter can then be used to estimate the period of how long an FDB
+ * entry has been inactive.
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_fdbt_update_activity_element(struct ntmp_user *user)
+{
+ struct fdbt_req_ua *req;
+ struct netc_swcbd swcbd;
+ struct netc_cbdr *cbdr;
+ union netc_cbd cbd;
+ u32 len;
+ int err;
+
+ swcbd.size = sizeof(*req);
+ err = ntmp_alloc_data_mem(user->dev, &swcbd, (void **)&req);
+ if (err)
+ return err;
+
+ /* Request data */
+ ntmp_fill_crd(&req->crd, user->tbl.fdbt_ver, 0, FDBT_UA_ACTEU);
+ req->ak.search.resume_eid = cpu_to_le32(NTMP_NULL_ENTRY_ID);
+ req->ak.search.cfge.cfg = cpu_to_le32(FDBT_DYNAMIC);
+ req->ak.search.cfge_mc = FDBT_CFGE_MC_DYNAMIC;
+
+ /* Request header */
+ len = NTMP_LEN(swcbd.size, NTMP_STATUS_RESP_LEN);
+ /* For activity update, the access method must be search */
+ ntmp_fill_request_hdr(&cbd, swcbd.dma, len, NTMP_FDBT_ID,
+ NTMP_CMD_UPDATE, NTMP_AM_SEARCH);
+
+ ntmp_select_and_lock_cbdr(user, &cbdr);
+ err = netc_xmit_ntmp_cmd(cbdr, &cbd, &swcbd);
+ if (err)
+ dev_err(user->dev,
+ "Failed to update activity of %s, err: %pe\n",
+ ntmp_table_name(NTMP_FDBT_ID), ERR_PTR(err));
+
+ ntmp_unlock_cbdr(cbdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(ntmp_fdbt_update_activity_element);
+
+/**
+ * ntmp_fdbt_delete_ageing_entries - delete all the ageing dynamic entries
+ * in the FDB table
+ * @user: target ntmp_user struct
+ * @act_cnt: the target value of the activity counter
+ *
+ * The matching rule is that the activity flag is not set and the activity
+ * counter is greater than or equal to act_cnt
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_fdbt_delete_ageing_entries(struct ntmp_user *user, u8 act_cnt)
+{
+ struct fdbt_req_qd *req;
+ struct netc_swcbd swcbd;
+ struct netc_cbdr *cbdr;
+ union netc_cbd cbd;
+ u32 len;
+ int err;
+
+ if (act_cnt > FDBT_ACT_CNT)
+ return -EINVAL;
+
+ swcbd.size = sizeof(*req);
+ err = ntmp_alloc_data_mem(user->dev, &swcbd, (void **)&req);
+ if (err)
+ return err;
+
+ /* Request data */
+ ntmp_fill_crd(&req->crd, user->tbl.fdbt_ver, 0, 0);
+ req->ak.search.resume_eid = cpu_to_le32(NTMP_NULL_ENTRY_ID);
+ req->ak.search.cfge.cfg = cpu_to_le32(FDBT_DYNAMIC);
+ req->ak.search.acte = act_cnt;
+ /* Exact match with ACTE_DATA[ACT_FLAG] AND
+ * match >= ACTE_DATA[ACT_CNT]
+ */
+ req->ak.search.acte_mc = FDBT_ACTE_MC;
+ req->ak.search.cfge_mc = FDBT_CFGE_MC_DYNAMIC;
+
+ /* Request header */
+ len = NTMP_LEN(swcbd.size, NTMP_STATUS_RESP_LEN);
+ ntmp_fill_request_hdr(&cbd, swcbd.dma, len, NTMP_FDBT_ID,
+ NTMP_CMD_DELETE, NTMP_AM_SEARCH);
+
+ ntmp_select_and_lock_cbdr(user, &cbdr);
+ err = netc_xmit_ntmp_cmd(cbdr, &cbd, &swcbd);
+ if (err)
+ dev_err(user->dev,
+ "Failed to delete ageing entries of %s, err: %pe\n",
+ ntmp_table_name(NTMP_FDBT_ID), ERR_PTR(err));
+
+ ntmp_unlock_cbdr(cbdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(ntmp_fdbt_delete_ageing_entries);
+
+/**
+ * ntmp_fdbt_delete_port_dynamic_entries - delete all dynamic FDB entries
+ * associated with the specified switch port
+ * @user: target ntmp_user struct
+ * @port: the specified switch port ID
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_fdbt_delete_port_dynamic_entries(struct ntmp_user *user, int port)
+{
+ struct fdbt_req_qd *req;
+ struct netc_swcbd swcbd;
+ struct netc_cbdr *cbdr;
+ union netc_cbd cbd;
+ u32 len;
+ int err;
+
+ swcbd.size = sizeof(*req);
+ err = ntmp_alloc_data_mem(user->dev, &swcbd, (void **)&req);
+ if (err)
+ return err;
+
+ /* Request data */
+ ntmp_fill_crd(&req->crd, user->tbl.fdbt_ver, 0, 0);
+ req->ak.search.resume_eid = cpu_to_le32(NTMP_NULL_ENTRY_ID);
+ req->ak.search.cfge.port_bitmap = cpu_to_le32(BIT(port));
+ req->ak.search.cfge.cfg = cpu_to_le32(FDBT_DYNAMIC);
+ /* Match CFGE_DATA[DYNAMIC & PORT_BITMAP] field */
+ req->ak.search.cfge_mc = FDBT_CFGE_MC_DYNAMIC_AND_PORT_BITMAP;
+
+ /* Request header */
+ len = NTMP_LEN(swcbd.size, NTMP_STATUS_RESP_LEN);
+ ntmp_fill_request_hdr(&cbd, swcbd.dma, len, NTMP_FDBT_ID,
+ NTMP_CMD_DELETE, NTMP_AM_SEARCH);
+
+ ntmp_select_and_lock_cbdr(user, &cbdr);
+ err = netc_xmit_ntmp_cmd(cbdr, &cbd, &swcbd);
+ if (err)
+ dev_err(user->dev,
+ "Failed to delete dynamic %s entries on port %d, err: %pe\n",
+ ntmp_table_name(NTMP_FDBT_ID), port, ERR_PTR(err));
+
+ ntmp_unlock_cbdr(cbdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(ntmp_fdbt_delete_port_dynamic_entries);
+
/**
* ntmp_vft_add_entry - add an entry into the VLAN filter table
* @user: target ntmp_user struct
diff --git a/drivers/net/ethernet/freescale/enetc/ntmp_private.h b/drivers/net/ethernet/freescale/enetc/ntmp_private.h
index 0a9b87286105..ad532b059ba8 100644
--- a/drivers/net/ethernet/freescale/enetc/ntmp_private.h
+++ b/drivers/net/ethernet/freescale/enetc/ntmp_private.h
@@ -155,8 +155,8 @@ struct fdbt_ak_search {
#define FDBT_KEYE_MAC GENMASK(1, 0)
u8 cfge_mc;
#define FDBT_CFGE_MC GENMASK(2, 0)
-#define FDBT_CFGE_MC_ANY 0
-#define FDBT_CFGE_MC_DYNAMIC 1
+#define FDBT_CFGE_MC_ANY 0
+#define FDBT_CFGE_MC_DYNAMIC 1
#define FDBT_CFGE_MC_PORT_BITMAP 2
#define FDBT_CFGE_MC_DYNAMIC_AND_PORT_BITMAP 3
u8 acte_mc;
diff --git a/include/linux/fsl/ntmp.h b/include/linux/fsl/ntmp.h
index 88166f9ad3a2..5db078e1caa0 100644
--- a/include/linux/fsl/ntmp.h
+++ b/include/linux/fsl/ntmp.h
@@ -263,6 +263,9 @@ int ntmp_fdbt_delete_entry(struct ntmp_user *user, u32 entry_id);
int ntmp_fdbt_search_port_entry(struct ntmp_user *user, int port,
u32 *resume_entry_id,
struct fdbt_entry_data *entry);
+int ntmp_fdbt_update_activity_element(struct ntmp_user *user);
+int ntmp_fdbt_delete_ageing_entries(struct ntmp_user *user, u8 act_cnt);
+int ntmp_fdbt_delete_port_dynamic_entries(struct ntmp_user *user, int port);
int ntmp_vft_add_entry(struct ntmp_user *user, u16 vid,
const struct vft_cfge_data *cfge);
int ntmp_bpt_update_entry(struct ntmp_user *user, u32 entry_id,
--
2.34.1
^ permalink raw reply related
* [PATCH v3 net-next 2/9] net: enetc: add "Update" and "Delete" operations to VLAN filter table
From: wei.fang @ 2026-06-05 1:48 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
Add two interfaces to manage entries in the VLAN filter table:
ntmp_vft_update_entry(): Update the configuration element data of the
specified VLAN filter entry based on the given VLAN ID. It uses the
exact key access method to locate the entry.
ntmp_vft_delete_entry(): Delete the VLAN filter entry corresponding to
the specified VLAN ID. It also uses the exact key access method to
identify the target entry.
In addition, introduce struct vft_req_qd to describe the request data
buffer format for Query and Delete actions of the VLAN filter table,
which contains a common request data header and a VLAN access key.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/enetc/ntmp.c | 103 ++++++++++++++++--
.../ethernet/freescale/enetc/ntmp_private.h | 6 +
include/linux/fsl/ntmp.h | 3 +
3 files changed, 105 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/ntmp.c b/drivers/net/ethernet/freescale/enetc/ntmp.c
index bda26fe93b8d..4e60bbc38cfa 100644
--- a/drivers/net/ethernet/freescale/enetc/ntmp.c
+++ b/drivers/net/ethernet/freescale/enetc/ntmp.c
@@ -956,15 +956,17 @@ int ntmp_fdbt_delete_port_dynamic_entries(struct ntmp_user *user, int port)
EXPORT_SYMBOL_GPL(ntmp_fdbt_delete_port_dynamic_entries);
/**
- * ntmp_vft_add_entry - add an entry into the VLAN filter table
+ * ntmp_vft_set_entry - add an entry into the VLAN filter table or update
+ * the configuration element data of the specified VLAN filter entry
* @user: target ntmp_user struct
* @vid: VLAN ID
+ * @cmd: command type, NTMP_CMD_ADD or NTMP_CMD_UPDATE
* @cfge: configuration element data
*
* Return: 0 on success, otherwise a negative error code
*/
-int ntmp_vft_add_entry(struct ntmp_user *user, u16 vid,
- const struct vft_cfge_data *cfge)
+static int ntmp_vft_set_entry(struct ntmp_user *user, u16 vid, int cmd,
+ const struct vft_cfge_data *cfge)
{
struct netc_swcbd swcbd;
struct vft_req_ua *req;
@@ -973,34 +975,121 @@ int ntmp_vft_add_entry(struct ntmp_user *user, u16 vid,
u32 len;
int err;
+ if (cmd != NTMP_CMD_ADD && cmd != NTMP_CMD_UPDATE)
+ return -EINVAL;
+
swcbd.size = sizeof(*req);
err = ntmp_alloc_data_mem(user->dev, &swcbd, (void **)&req);
if (err)
return err;
/* Request data */
- ntmp_fill_crd(&req->crd, user->tbl.vft_ver, 0,
- NTMP_GEN_UA_CFGEU);
+ ntmp_fill_crd(&req->crd, user->tbl.vft_ver, 0, NTMP_GEN_UA_CFGEU);
req->ak.exact.vid = cpu_to_le16(vid);
req->cfge = *cfge;
/* Request header */
len = NTMP_LEN(swcbd.size, NTMP_STATUS_RESP_LEN);
ntmp_fill_request_hdr(&cbd, swcbd.dma, len, NTMP_VFT_ID,
- NTMP_CMD_ADD, NTMP_AM_EXACT_KEY);
+ cmd, NTMP_AM_EXACT_KEY);
ntmp_select_and_lock_cbdr(user, &cbdr);
err = netc_xmit_ntmp_cmd(cbdr, &cbd, &swcbd);
+ ntmp_unlock_cbdr(cbdr);
+
+ return err;
+}
+
+/**
+ * ntmp_vft_add_entry - add an entry into the VLAN filter table
+ * @user: target ntmp_user struct
+ * @vid: VLAN ID
+ * @cfge: configuration element data
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_vft_add_entry(struct ntmp_user *user, u16 vid,
+ const struct vft_cfge_data *cfge)
+{
+ int err;
+
+ err = ntmp_vft_set_entry(user, vid, NTMP_CMD_ADD, cfge);
if (err)
dev_err(user->dev,
"Failed to add %s entry, vid: %u, err: %pe\n",
ntmp_table_name(NTMP_VFT_ID), vid, ERR_PTR(err));
+ return err;
+}
+EXPORT_SYMBOL_GPL(ntmp_vft_add_entry);
+
+/**
+ * ntmp_vft_update_entry - update the configuration element data of the
+ * specified VLAN filter entry
+ * @user: target ntmp_user struct
+ * @vid: VLAN ID
+ * @cfge: configuration element data
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_vft_update_entry(struct ntmp_user *user, u16 vid,
+ const struct vft_cfge_data *cfge)
+{
+ int err;
+
+ err = ntmp_vft_set_entry(user, vid, NTMP_CMD_UPDATE, cfge);
+ if (err)
+ dev_err(user->dev,
+ "Failed to update %s entry, vid: %u, err: %pe\n",
+ ntmp_table_name(NTMP_VFT_ID), vid, ERR_PTR(err));
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(ntmp_vft_update_entry);
+
+/**
+ * ntmp_vft_delete_entry - delete the VLAN filter entry based on the
+ * specified VLAN ID
+ * @user: target ntmp_user struct
+ * @vid: VLAN ID
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_vft_delete_entry(struct ntmp_user *user, u16 vid)
+{
+ struct netc_swcbd swcbd;
+ struct vft_req_qd *req;
+ struct netc_cbdr *cbdr;
+ union netc_cbd cbd;
+ u32 len;
+ int err;
+
+ swcbd.size = sizeof(*req);
+ err = ntmp_alloc_data_mem(user->dev, &swcbd, (void **)&req);
+ if (err)
+ return err;
+
+ /* Request data */
+ ntmp_fill_crd(&req->crd, user->tbl.vft_ver, 0, 0);
+ req->ak.exact.vid = cpu_to_le16(vid);
+
+ /* Request header */
+ len = NTMP_LEN(swcbd.size, NTMP_STATUS_RESP_LEN);
+ ntmp_fill_request_hdr(&cbd, swcbd.dma, len, NTMP_VFT_ID,
+ NTMP_CMD_DELETE, NTMP_AM_EXACT_KEY);
+
+ ntmp_select_and_lock_cbdr(user, &cbdr);
+ err = netc_xmit_ntmp_cmd(cbdr, &cbd, &swcbd);
+ if (err)
+ dev_err(user->dev,
+ "Failed to delete %s entry, vid: %u, err: %pe\n",
+ ntmp_table_name(NTMP_VFT_ID), vid, ERR_PTR(err));
+
ntmp_unlock_cbdr(cbdr);
return err;
}
-EXPORT_SYMBOL_GPL(ntmp_vft_add_entry);
+EXPORT_SYMBOL_GPL(ntmp_vft_delete_entry);
int ntmp_bpt_update_entry(struct ntmp_user *user, u32 entry_id,
const struct bpt_cfge_data *cfge)
diff --git a/drivers/net/ethernet/freescale/enetc/ntmp_private.h b/drivers/net/ethernet/freescale/enetc/ntmp_private.h
index ad532b059ba8..9d30f128849a 100644
--- a/drivers/net/ethernet/freescale/enetc/ntmp_private.h
+++ b/drivers/net/ethernet/freescale/enetc/ntmp_private.h
@@ -211,6 +211,12 @@ struct vft_req_ua {
struct vft_cfge_data cfge;
};
+/* VLAN Filter Table Request Data Buffer Format of Query and Delete actions */
+struct vft_req_qd {
+ struct ntmp_cmn_req_data crd;
+ union vft_access_key ak;
+};
+
/* Buffer Pool Table Request Data Buffer Format of Update action */
struct bpt_req_update {
struct ntmp_req_by_eid rbe;
diff --git a/include/linux/fsl/ntmp.h b/include/linux/fsl/ntmp.h
index 5db078e1caa0..36a9089526ad 100644
--- a/include/linux/fsl/ntmp.h
+++ b/include/linux/fsl/ntmp.h
@@ -268,6 +268,9 @@ int ntmp_fdbt_delete_ageing_entries(struct ntmp_user *user, u8 act_cnt);
int ntmp_fdbt_delete_port_dynamic_entries(struct ntmp_user *user, int port);
int ntmp_vft_add_entry(struct ntmp_user *user, u16 vid,
const struct vft_cfge_data *cfge);
+int ntmp_vft_update_entry(struct ntmp_user *user, u16 vid,
+ const struct vft_cfge_data *cfge);
+int ntmp_vft_delete_entry(struct ntmp_user *user, u16 vid);
int ntmp_bpt_update_entry(struct ntmp_user *user, u32 entry_id,
const struct bpt_cfge_data *cfge);
#else
--
2.34.1
^ permalink raw reply related
* [PATCH v3 net-next 3/9] net: enetc: add interfaces to manage egress treatment table
From: wei.fang @ 2026-06-05 1:48 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
Each entry in the egress treatment table contains the egress packet
processing actions to be applied to a grouping or scope of packets
exiting on a particular egress port of the switch. A scope of packets,
for example, could be the packets exiting a particular VLAN, matching
a particular 802.1Q bridge forwarding entry or belonging to a stream
identified at ingress. The egress treatment table is implemented as a
linear array of entries accessed using an index (0, 1, 2, ..., n) that
uniquely identifies an entry within the array.
The egress treatment table only supports access via entry ID, which is
assigned by the software. It supports Add, Update, Delete and Query
operations. Note that the Query operation is not implemented yet.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
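Illustrative note, not part of the patch: the efm_cfg field layout added
to ntmp.h below can be modelled in user space as follows. This is only a
sketch. MASK16() and the *_SHIFT names are hypothetical stand-ins for the
kernel's GENMASK()/FIELD_PREP(), and reading 0x7c as "minus 4 bytes" is an
inference from the 7-bit width of ETT_EFM_LEN_CHANGE and the 4-byte size
of a VLAN tag.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for GENMASK(h, l), limited to 16-bit fields. */
#define MASK16(h, l)  ((uint16_t)(((1u << ((h) - (l) + 1)) - 1u) << (l)))

/* Field positions copied from struct ett_cfge_data in this patch. */
#define ETT_ECA              MASK16(8, 6)
#define ETT_ECA_SHIFT        6              /* hypothetical helper name */
#define ETT_ECA_INC          1
#define ETT_EFM_LEN_CHANGE   MASK16(15, 9)
#define ETT_EFM_LEN_SHIFT    9              /* hypothetical helper name */

/* Encode a signed byte delta as a 7-bit two's-complement value. */
static uint16_t len_change_encode(int delta)
{
	assert(delta >= -64 && delta <= 63);
	return (uint16_t)((unsigned int)delta & 0x7fu);
}

/* Build efm_cfg (CPU byte order) for an optional counter and VLAN strip. */
static uint16_t ett_build_efm_cfg(int count_frames, int strip_vlan)
{
	uint16_t cfg = 0;

	if (count_frames)
		cfg |= (uint16_t)((ETT_ECA_INC << ETT_ECA_SHIFT) & ETT_ECA);

	/* Removing a 4-byte VLAN tag shortens the frame by 4 bytes. */
	if (strip_vlan)
		cfg |= (uint16_t)((len_change_encode(-4) << ETT_EFM_LEN_SHIFT) &
				  ETT_EFM_LEN_CHANGE);

	return cfg;
}
```

Under that reading, the three ETT_FRM_LEN_DEL_* constants would correspond
to deltas of -4, -6 and -10 bytes. The driver would additionally convert
the result with cpu_to_le16() before storing it in the request buffer.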
drivers/net/ethernet/freescale/enetc/ntmp.c | 106 ++++++++++++++++++
.../ethernet/freescale/enetc/ntmp_private.h | 8 ++
include/linux/fsl/ntmp.h | 23 ++++
3 files changed, 137 insertions(+)
diff --git a/drivers/net/ethernet/freescale/enetc/ntmp.c b/drivers/net/ethernet/freescale/enetc/ntmp.c
index 4e60bbc38cfa..9249f78219ed 100644
--- a/drivers/net/ethernet/freescale/enetc/ntmp.c
+++ b/drivers/net/ethernet/freescale/enetc/ntmp.c
@@ -24,6 +24,7 @@
#define NTMP_IPFT_ID 13
#define NTMP_FDBT_ID 15
#define NTMP_VFT_ID 18
+#define NTMP_ETT_ID 33
#define NTMP_BPT_ID 41
/* Generic Update Actions for most tables */
@@ -284,6 +285,8 @@ static const char *ntmp_table_name(int tbl_id)
return "FDB Table";
case NTMP_VFT_ID:
return "VLAN Filter Table";
+ case NTMP_ETT_ID:
+ return "Egress Treatment Table";
case NTMP_BPT_ID:
return "Buffer Pool Table";
default:
@@ -1091,6 +1094,109 @@ int ntmp_vft_delete_entry(struct ntmp_user *user, u16 vid)
}
EXPORT_SYMBOL_GPL(ntmp_vft_delete_entry);
+/**
+ * ntmp_ett_set_entry - add a new entry to the egress treatment table or
+ * update the configuration element data of the specified entry
+ * @user: target ntmp_user struct
+ * @entry_id: entry ID
+ * @cmd: command type, NTMP_CMD_ADD or NTMP_CMD_UPDATE
+ * @cfge: configuration element data
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+static int ntmp_ett_set_entry(struct ntmp_user *user, u32 entry_id,
+ int cmd, const struct ett_cfge_data *cfge)
+{
+ struct netc_swcbd swcbd;
+ struct ett_req_ua *req;
+ struct netc_cbdr *cbdr;
+ union netc_cbd cbd;
+ int err;
+
+ if (cmd != NTMP_CMD_ADD && cmd != NTMP_CMD_UPDATE)
+ return -EINVAL;
+
+ swcbd.size = sizeof(*req);
+ err = ntmp_alloc_data_mem(user->dev, &swcbd, (void **)&req);
+ if (err)
+ return err;
+
+ /* Request data */
+ ntmp_fill_crd_eid(&req->rbe, user->tbl.ett_ver, 0,
+ NTMP_GEN_UA_CFGEU, entry_id);
+ req->cfge = *cfge;
+
+ /* Request header */
+ ntmp_fill_request_hdr(&cbd, swcbd.dma, NTMP_LEN(swcbd.size, 0),
+ NTMP_ETT_ID, cmd, NTMP_AM_ENTRY_ID);
+
+ ntmp_select_and_lock_cbdr(user, &cbdr);
+ err = netc_xmit_ntmp_cmd(cbdr, &cbd, &swcbd);
+ ntmp_unlock_cbdr(cbdr);
+
+ return err;
+}
+
+/**
+ * ntmp_ett_add_entry - add a new entry to the egress treatment table
+ * @user: target ntmp_user struct
+ * @entry_id: entry ID
+ * @cfge: configuration element data
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_ett_add_entry(struct ntmp_user *user, u32 entry_id,
+ const struct ett_cfge_data *cfge)
+{
+ int err;
+
+ err = ntmp_ett_set_entry(user, entry_id, NTMP_CMD_ADD, cfge);
+ if (err)
+ dev_err(user->dev, "Failed to add %s entry 0x%x, err: %pe\n",
+ ntmp_table_name(NTMP_ETT_ID), entry_id, ERR_PTR(err));
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(ntmp_ett_add_entry);
+
+/**
+ * ntmp_ett_update_entry - update the configuration element data of the
+ * specified entry
+ * @user: target ntmp_user struct
+ * @entry_id: entry ID
+ * @cfge: configuration element data
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_ett_update_entry(struct ntmp_user *user, u32 entry_id,
+ const struct ett_cfge_data *cfge)
+{
+ int err;
+
+ err = ntmp_ett_set_entry(user, entry_id, NTMP_CMD_UPDATE, cfge);
+ if (err)
+ dev_err(user->dev,
+ "Failed to update %s entry 0x%x, err: %pe\n",
+ ntmp_table_name(NTMP_ETT_ID), entry_id, ERR_PTR(err));
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(ntmp_ett_update_entry);
+
+/**
+ * ntmp_ett_delete_entry - delete the specified egress treatment table entry
+ * @user: target ntmp_user struct
+ * @entry_id: entry ID
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_ett_delete_entry(struct ntmp_user *user, u32 entry_id)
+{
+ return ntmp_delete_entry_by_id(user, NTMP_ETT_ID, user->tbl.ett_ver,
+ entry_id, NTMP_EID_REQ_LEN, 0);
+}
+EXPORT_SYMBOL_GPL(ntmp_ett_delete_entry);
+
int ntmp_bpt_update_entry(struct ntmp_user *user, u32 entry_id,
const struct bpt_cfge_data *cfge)
{
diff --git a/drivers/net/ethernet/freescale/enetc/ntmp_private.h b/drivers/net/ethernet/freescale/enetc/ntmp_private.h
index 9d30f128849a..531ea7ddd145 100644
--- a/drivers/net/ethernet/freescale/enetc/ntmp_private.h
+++ b/drivers/net/ethernet/freescale/enetc/ntmp_private.h
@@ -217,6 +217,14 @@ struct vft_req_qd {
union vft_access_key ak;
};
+/* Egress Treatment Table Request Data Buffer Format of Update and Add
+ * actions
+ */
+struct ett_req_ua {
+ struct ntmp_req_by_eid rbe;
+ struct ett_cfge_data cfge;
+};
+
/* Buffer Pool Table Request Data Buffer Format of Update action */
struct bpt_req_update {
struct ntmp_req_by_eid rbe;
diff --git a/include/linux/fsl/ntmp.h b/include/linux/fsl/ntmp.h
index 36a9089526ad..0c951e1c763d 100644
--- a/include/linux/fsl/ntmp.h
+++ b/include/linux/fsl/ntmp.h
@@ -36,6 +36,7 @@ struct netc_tbl_vers {
u8 vft_ver;
u8 bpt_ver;
u8 ipft_ver;
+ u8 ett_ver;
};
struct netc_swcbd {
@@ -214,6 +215,23 @@ struct vft_cfge_data {
__le32 et_eid;
};
+struct ett_cfge_data {
+ __le16 efm_cfg;
+#define ETT_EFM_MODE GENMASK(1, 0)
+#define ETT_ESQA GENMASK(5, 4)
+#define ETT_ECA GENMASK(8, 6)
+#define ETT_ECA_INC 1
+#define ETT_EFM_LEN_CHANGE GENMASK(15, 9)
+#define ETT_FRM_LEN_DEL_VLAN 0x7c
+#define ETT_FRM_LEN_DEL_RTAG 0x7a
+#define ETT_FRM_LEN_DEL_VLAN_RTAG 0x76
+ __le16 efm_data_len;
+#define ETT_EFM_DATA_LEN GENMASK(10, 0)
+ __le32 efm_eid;
+ __le32 ec_eid;
+ __le32 esqa_tgt_eid;
+};
+
struct bpt_bpse_data {
__le32 amount_used;
__le32 amount_used_hwm;
@@ -271,6 +289,11 @@ int ntmp_vft_add_entry(struct ntmp_user *user, u16 vid,
int ntmp_vft_update_entry(struct ntmp_user *user, u16 vid,
const struct vft_cfge_data *cfge);
int ntmp_vft_delete_entry(struct ntmp_user *user, u16 vid);
+int ntmp_ett_add_entry(struct ntmp_user *user, u32 entry_id,
+ const struct ett_cfge_data *cfge);
+int ntmp_ett_update_entry(struct ntmp_user *user, u32 entry_id,
+ const struct ett_cfge_data *cfge);
+int ntmp_ett_delete_entry(struct ntmp_user *user, u32 entry_id);
int ntmp_bpt_update_entry(struct ntmp_user *user, u32 entry_id,
const struct bpt_cfge_data *cfge);
#else
--
2.34.1
^ permalink raw reply related
* [PATCH v3 net-next 4/9] net: enetc: add "Update" operation to the egress count table
From: wei.fang @ 2026-06-05 1:48 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
The egress count table is a static bounded index table that maintains
egress-related statistics. The table is implemented as a
linear array of entries accessed using an index (0, 1, 2, ..., n) that
uniquely identifies an entry within the array. Egress Counter Entry ID
(EC_EID) is used as an index to an entry in this table. The EC_EID is
specified in the egress treatment table.
Egress count table entries are always present and enabled. The table
only supports access via entry ID, which is assigned by the software.
The table supports the Update, Query, and Query-followed-by-Update
operations. Currently, only the Update operation is implemented.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/ethernet/freescale/enetc/ntmp.c | 45 +++++++++++++++++++++
include/linux/fsl/ntmp.h | 2 +
2 files changed, 47 insertions(+)
diff --git a/drivers/net/ethernet/freescale/enetc/ntmp.c b/drivers/net/ethernet/freescale/enetc/ntmp.c
index 9249f78219ed..601435966ed1 100644
--- a/drivers/net/ethernet/freescale/enetc/ntmp.c
+++ b/drivers/net/ethernet/freescale/enetc/ntmp.c
@@ -25,6 +25,7 @@
#define NTMP_FDBT_ID 15
#define NTMP_VFT_ID 18
#define NTMP_ETT_ID 33
+#define NTMP_ECT_ID 39
#define NTMP_BPT_ID 41
/* Generic Update Actions for most tables */
@@ -33,6 +34,7 @@
/* Specific Update Actions for some tables */
#define FDBT_UA_ACTEU BIT(1)
+#define ECT_UA_STSEU BIT(0)
#define BPT_UA_BPSEU BIT(1)
/* Query Action: 0: Full query. 1: Query entry ID, the fields after entry
@@ -287,6 +289,8 @@ static const char *ntmp_table_name(int tbl_id)
return "VLAN Filter Table";
case NTMP_ETT_ID:
return "Egress Treatment Table";
+ case NTMP_ECT_ID:
+ return "Egress Count Table";
case NTMP_BPT_ID:
return "Buffer Pool Table";
default:
@@ -1197,6 +1201,47 @@ int ntmp_ett_delete_entry(struct ntmp_user *user, u32 entry_id)
}
EXPORT_SYMBOL_GPL(ntmp_ett_delete_entry);
+/**
+ * ntmp_ect_update_entry - reset the statistics element data of the
+ * specified egress counter table entry
+ * @user: target ntmp_user struct
+ * @entry_id: entry ID
+ *
+ * Return: 0 on success, otherwise a negative error code
+ */
+int ntmp_ect_update_entry(struct ntmp_user *user, u32 entry_id)
+{
+ struct ntmp_req_by_eid *req;
+ struct netc_swcbd swcbd;
+ struct netc_cbdr *cbdr;
+ union netc_cbd cbd;
+ int err;
+
+ swcbd.size = sizeof(*req);
+ err = ntmp_alloc_data_mem(user->dev, &swcbd, (void **)&req);
+ if (err)
+ return err;
+
+ /* Request data */
+ ntmp_fill_crd_eid(req, user->tbl.ect_ver, 0, ECT_UA_STSEU, entry_id);
+
+ /* Request header */
+ ntmp_fill_request_hdr(&cbd, swcbd.dma, NTMP_LEN(swcbd.size, 0),
+ NTMP_ECT_ID, NTMP_CMD_UPDATE, NTMP_AM_ENTRY_ID);
+
+ ntmp_select_and_lock_cbdr(user, &cbdr);
+ err = netc_xmit_ntmp_cmd(cbdr, &cbd, &swcbd);
+ if (err)
+ dev_err(user->dev,
+ "Failed to update %s entry 0x%x, err: %pe\n",
+ ntmp_table_name(NTMP_ECT_ID), entry_id, ERR_PTR(err));
+
+ ntmp_unlock_cbdr(cbdr);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(ntmp_ect_update_entry);
+
int ntmp_bpt_update_entry(struct ntmp_user *user, u32 entry_id,
const struct bpt_cfge_data *cfge)
{
diff --git a/include/linux/fsl/ntmp.h b/include/linux/fsl/ntmp.h
index 0c951e1c763d..1222901f48a7 100644
--- a/include/linux/fsl/ntmp.h
+++ b/include/linux/fsl/ntmp.h
@@ -37,6 +37,7 @@ struct netc_tbl_vers {
u8 bpt_ver;
u8 ipft_ver;
u8 ett_ver;
+ u8 ect_ver;
};
struct netc_swcbd {
@@ -294,6 +295,7 @@ int ntmp_ett_add_entry(struct ntmp_user *user, u32 entry_id,
int ntmp_ett_update_entry(struct ntmp_user *user, u32 entry_id,
const struct ett_cfge_data *cfge);
int ntmp_ett_delete_entry(struct ntmp_user *user, u32 entry_id);
+int ntmp_ect_update_entry(struct ntmp_user *user, u32 entry_id);
int ntmp_bpt_update_entry(struct ntmp_user *user, u32 entry_id,
const struct bpt_cfge_data *cfge);
#else
--
2.34.1
^ permalink raw reply related
* [PATCH v3 net-next 5/9] net: dsa: netc: initialize the group bitmap of ETT and ECT
From: wei.fang @ 2026-06-05 1:48 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
The Egress Treatment Table (ETT) and Egress Count Table (ECT) are both
index tables whose entry IDs are allocated by software. Every num_ports
entries form a group, where each entry in the group corresponds to one
port. To facilitate group allocation and management, initialize the group
index bitmaps for both tables based on hardware capabilities reported by
ETTCAPR and ECTCAPR registers.
The bitmap size per table is calculated as the total number of hardware
entries divided by the number of available ports, which gives the number
of groups available for software allocation. A set bit in the bitmap
represents a group index that has been allocated.
These bitmaps will be used by subsequent patches that add VLAN support.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
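Illustrative note, not part of the patch: the group arithmetic described
above can be sketched in user space as follows. group_count() mirrors the
"entries divided by ports" calculation. group_entry_id() is an assumption:
it supposes that groups are laid out back to back in the index table, which
this patch does not state, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors NETC_NUM_ENTRIES, i.e. GENMASK(15, 0) of the xxxCAPR value. */
#define NUM_ENTRIES_MASK 0xffffu

/*
 * Number of groups software can allocate: total hardware entries divided
 * by the number of available ports. A result of 0 corresponds to the
 * -EINVAL paths in netc_init_ntmp_bitmap_sizes().
 */
static uint32_t group_count(uint32_t capr_val, uint32_t num_ports)
{
	if (!num_ports)
		return 0;

	return (capr_val & NUM_ENTRIES_MASK) / num_ports;
}

/*
 * ASSUMPTION: group N occupies entries [N * num_ports, (N + 1) * num_ports),
 * and each port owns the entry at its own offset inside the group.
 */
static uint32_t group_entry_id(uint32_t gid, uint32_t num_ports,
			       uint32_t port_offset)
{
	assert(port_offset < num_ports);

	return gid * num_ports + port_offset;
}
```

For example, 64 hardware entries shared by 5 available ports would give 12
groups, with the 4 leftover entries unused under this layout.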
drivers/net/dsa/netc/netc_main.c | 90 ++++++++++++++++++++++++++-
drivers/net/dsa/netc/netc_switch_hw.h | 6 ++
include/linux/fsl/ntmp.h | 7 +++
3 files changed, 102 insertions(+), 1 deletion(-)
diff --git a/drivers/net/dsa/netc/netc_main.c b/drivers/net/dsa/netc/netc_main.c
index fa7dd307ce13..d4475ad7ed6c 100644
--- a/drivers/net/dsa/netc/netc_main.c
+++ b/drivers/net/dsa/netc/netc_main.c
@@ -323,16 +323,104 @@ static void netc_remove_all_cbdrs(struct netc_switch *priv)
ntmp_free_cbdr(&ntmp->ring[i]);
}
+static u32 netc_num_available_ports(struct netc_switch *priv)
+{
+ struct dsa_port *dp;
+ u32 num_ports = 0;
+
+ dsa_switch_for_each_available_port(dp, priv->ds)
+ num_ports++;
+
+ return num_ports;
+}
+
+static int netc_init_ntmp_bitmap_sizes(struct netc_switch *priv)
+{
+ u32 num_ports = netc_num_available_ports(priv);
+ struct netc_switch_regs *regs = &priv->regs;
+ struct ntmp_user *ntmp = &priv->ntmp;
+ u32 val;
+
+ if (!num_ports)
+ return -EINVAL;
+
+ val = netc_base_rd(regs, NETC_ETTCAPR);
+ ntmp->ett_bitmap_size = NETC_GET_NUM_ENTRIES(val) / num_ports;
+ if (!ntmp->ett_bitmap_size)
+ return -EINVAL;
+
+ val = netc_base_rd(regs, NETC_ECTCAPR);
+ ntmp->ect_bitmap_size = NETC_GET_NUM_ENTRIES(val) / num_ports;
+ if (!ntmp->ect_bitmap_size)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int netc_init_ntmp_bitmaps(struct netc_switch *priv)
+{
+ struct ntmp_user *ntmp = &priv->ntmp;
+
+ ntmp->ett_gid_bitmap = bitmap_zalloc(ntmp->ett_bitmap_size,
+ GFP_KERNEL);
+ if (!ntmp->ett_gid_bitmap)
+ return -ENOMEM;
+
+ ntmp->ect_gid_bitmap = bitmap_zalloc(ntmp->ect_bitmap_size,
+ GFP_KERNEL);
+ if (!ntmp->ect_gid_bitmap)
+ goto free_ett_gid_bitmap;
+
+ return 0;
+
+free_ett_gid_bitmap:
+ bitmap_free(ntmp->ett_gid_bitmap);
+ ntmp->ett_gid_bitmap = NULL;
+
+ return -ENOMEM;
+}
+
+static void netc_free_ntmp_bitmaps(struct netc_switch *priv)
+{
+ struct ntmp_user *ntmp = &priv->ntmp;
+
+ bitmap_free(ntmp->ect_gid_bitmap);
+ ntmp->ect_gid_bitmap = NULL;
+
+ bitmap_free(ntmp->ett_gid_bitmap);
+ ntmp->ett_gid_bitmap = NULL;
+}
+
static int netc_init_ntmp_user(struct netc_switch *priv)
{
+ int err;
+
netc_init_ntmp_tbl_versions(priv);
- return netc_init_all_cbdrs(priv);
+ err = netc_init_ntmp_bitmap_sizes(priv);
+ if (err)
+ return err;
+
+ err = netc_init_ntmp_bitmaps(priv);
+ if (err)
+ return err;
+
+ err = netc_init_all_cbdrs(priv);
+ if (err)
+ goto free_ntmp_bitmaps;
+
+ return 0;
+
+free_ntmp_bitmaps:
+ netc_free_ntmp_bitmaps(priv);
+
+ return err;
}
static void netc_free_ntmp_user(struct netc_switch *priv)
{
netc_remove_all_cbdrs(priv);
+ netc_free_ntmp_bitmaps(priv);
}
static void netc_switch_dos_default_config(struct netc_switch *priv)
diff --git a/drivers/net/dsa/netc/netc_switch_hw.h b/drivers/net/dsa/netc/netc_switch_hw.h
index 1d976882a6cc..1404ae41c7bc 100644
--- a/drivers/net/dsa/netc/netc_switch_hw.h
+++ b/drivers/net/dsa/netc/netc_switch_hw.h
@@ -36,6 +36,12 @@
#define DOSL3CR_SAMEADDR BIT(0)
#define DOSL3CR_IPSAMCC BIT(1)
+#define NETC_ETTCAPR 0x18c4
+#define NETC_ECTCAPR 0x18ec
+/* Index table NUM_ENTRIES mask */
+#define NETC_NUM_ENTRIES GENMASK(15, 0)
+#define NETC_GET_NUM_ENTRIES(v) FIELD_GET(NETC_NUM_ENTRIES, (v))
+
/* Hash table memory capability register, the memory is shared by
* the following tables:
*
diff --git a/include/linux/fsl/ntmp.h b/include/linux/fsl/ntmp.h
index 1222901f48a7..e8b1bd802f19 100644
--- a/include/linux/fsl/ntmp.h
+++ b/include/linux/fsl/ntmp.h
@@ -3,6 +3,7 @@
#ifndef __NETC_NTMP_H
#define __NETC_NTMP_H
+#include <linux/bitmap.h>
#include <linux/bitops.h>
#include <linux/if_ether.h>
@@ -70,6 +71,12 @@ struct ntmp_user {
struct device *dev;
struct netc_cbdr *ring;
struct netc_tbl_vers tbl;
+
+ /* NTMP table bitmaps for resource management */
+ u32 ett_bitmap_size;
+ u32 ect_bitmap_size;
+ unsigned long *ett_gid_bitmap; /* only valid for switch */
+ unsigned long *ect_gid_bitmap; /* only valid for switch */
};
struct maft_entry_data {
--
2.34.1
^ permalink raw reply related
* [PATCH v3 net-next 6/9] net: enetc: add helpers to set/clear table bitmap
From: wei.fang @ 2026-06-05 1:48 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
NTMP index tables require software to allocate and manage entry IDs.
Add two bitmap helper functions to facilitate this management:
ntmp_lookup_free_eid(): finds the first zero bit in the given bitmap,
sets it to mark the entry as in-use, and returns the corresponding entry
ID. Returns NTMP_NULL_ENTRY_ID if no free entry is available.
ntmp_clear_eid_bitmap(): clears the bit associated with the given entry
ID in the bitmap to mark the entry as free. It is a no-op if the entry
ID is NTMP_NULL_ENTRY_ID.
Both functions are exported for use by other modules, such as the NETC
switch driver which needs to manage group index bitmaps for the Egress
Treatment Table (ETT) and Egress Count Table (ECT).
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
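Illustrative note, not part of the patch: a user-space model of the two
helpers, to show the allocate/release contract. It is a sketch only. The
kernel versions use find_first_zero_bit(), set_bit() and clear_bit(); the
loop below is a plain, non-atomic replacement, and NULL_ENTRY_ID is a
stand-in whose value is assumed rather than taken from this patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumed stand-in for NTMP_NULL_ENTRY_ID. */
#define NULL_ENTRY_ID  0xffffffffu
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)

/* Find the first clear bit below 'size', mark it used and return its index. */
static uint32_t lookup_free_eid(unsigned long *bitmap, uint32_t size)
{
	assert(bitmap);

	for (uint32_t id = 0; id < size; id++) {
		unsigned long *word = &bitmap[id / BITS_PER_WORD];
		unsigned long bit = 1UL << (id % BITS_PER_WORD);

		if (!(*word & bit)) {
			*word |= bit;	/* not atomic, unlike set_bit() */
			return id;
		}
	}

	return NULL_ENTRY_ID;		/* bitmap is full */
}

/* Release an ID; releasing NULL_ENTRY_ID is a no-op. */
static void clear_eid_bitmap(unsigned long *bitmap, uint32_t entry_id)
{
	assert(bitmap);

	if (entry_id == NULL_ENTRY_ID)
		return;

	bitmap[entry_id / BITS_PER_WORD] &= ~(1UL << (entry_id % BITS_PER_WORD));
}
```

The find and the set are two separate steps in both this model and the
patch, so callers presumably need their own serialisation when several
contexts allocate from the same bitmap.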
drivers/net/ethernet/freescale/enetc/ntmp.c | 24 +++++++++++++++++++++
include/linux/fsl/ntmp.h | 2 ++
2 files changed, 26 insertions(+)
diff --git a/drivers/net/ethernet/freescale/enetc/ntmp.c b/drivers/net/ethernet/freescale/enetc/ntmp.c
index 601435966ed1..9f38f885ebb5 100644
--- a/drivers/net/ethernet/freescale/enetc/ntmp.c
+++ b/drivers/net/ethernet/freescale/enetc/ntmp.c
@@ -47,6 +47,30 @@
#define RSST_STSE_DATA_SIZE(n) ((n) * 8)
#define RSST_CFGE_DATA_SIZE(n) (n)
+u32 ntmp_lookup_free_eid(unsigned long *bitmap, u32 size)
+{
+ u32 entry_id;
+
+ entry_id = find_first_zero_bit(bitmap, size);
+ if (entry_id == size)
+ return NTMP_NULL_ENTRY_ID;
+
+ /* Set the bit once we found it */
+ set_bit(entry_id, bitmap);
+
+ return entry_id;
+}
+EXPORT_SYMBOL_GPL(ntmp_lookup_free_eid);
+
+void ntmp_clear_eid_bitmap(unsigned long *bitmap, u32 entry_id)
+{
+ if (entry_id == NTMP_NULL_ENTRY_ID)
+ return;
+
+ clear_bit(entry_id, bitmap);
+}
+EXPORT_SYMBOL_GPL(ntmp_clear_eid_bitmap);
+
int ntmp_init_cbdr(struct netc_cbdr *cbdr, struct device *dev,
const struct netc_cbdr_regs *regs)
{
diff --git a/include/linux/fsl/ntmp.h b/include/linux/fsl/ntmp.h
index e8b1bd802f19..4d329488763d 100644
--- a/include/linux/fsl/ntmp.h
+++ b/include/linux/fsl/ntmp.h
@@ -266,6 +266,8 @@ struct bpt_cfge_data {
int ntmp_init_cbdr(struct netc_cbdr *cbdr, struct device *dev,
const struct netc_cbdr_regs *regs);
void ntmp_free_cbdr(struct netc_cbdr *cbdr);
+u32 ntmp_lookup_free_eid(unsigned long *bitmap, u32 size);
+void ntmp_clear_eid_bitmap(unsigned long *bitmap, u32 entry_id);
/* NTMP APIs */
int ntmp_maft_add_entry(struct ntmp_user *user, u32 entry_id,
--
2.34.1
^ permalink raw reply related
* [PATCH v3 net-next 7/9] net: dsa: netc: add VLAN filter table and egress treatment management
From: wei.fang @ 2026-06-05 1:48 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
Implement the DSA .port_vlan_add and .port_vlan_del operations to enable
VLAN-aware bridge offloading on the NETC switch.
VLAN membership is maintained in the VLAN Filter Table (VFT). Adding the
first port to a VLAN creates a new VFT entry with hardware MAC learning
and flood-on-miss forwarding; subsequent ports update the existing
entry's membership bitmap. Removing the last port deletes the entry.
Egress tagging is handled through the Egress Treatment Table (ETT). Each
VLAN is allocated a group of ETT entries, one per available port. Ports
are assigned a sequential ett_offset during initialisation, used to
address each port's entry within the group. Untagged ports configure the
ETT to strip the outer VLAN tag; tagged ports pass frames through
unmodified. Each ETT group is optionally paired with an Egress Counter
Table (ECT) group for per-port frame counting, allocated on a best-effort
basis. When the egress rule of an ETT entry changes, the counter of the
corresponding ECT entry is reset so that it counts only frames that
match the new egress rule.
A software shadow list serialised by vft_lock tracks active VLAN state
across both port membership and egress tagging. VID 0 is used for
single-port mode and is ignored by both callbacks.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
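Illustrative note, not part of the patch: the VFT membership rule described
above (first port adds the entry, later ports update it, last port deletes
it) can be modelled in user space as below. All names are hypothetical and
the model deliberately ignores egress tagging, so a repeated join that only
changes the tagged/untagged state is reported as "no VFT operation" here
even though the driver may still have ETT work to do.

```c
#include <assert.h>
#include <stdint.h>

/* Which VLAN filter table command a membership change would require. */
enum vft_op { VFT_OP_NONE, VFT_OP_ADD, VFT_OP_UPDATE, VFT_OP_DELETE };

/* Minimal shadow of one VLAN: its ID and the bitmap of member ports. */
struct vlan_shadow {
	uint16_t vid;
	uint32_t port_bitmap;
};

/* A port joins the VLAN: first member adds the entry, others update it. */
static enum vft_op vlan_port_join(struct vlan_shadow *v, unsigned int port)
{
	uint32_t old = v->port_bitmap;

	assert(port < 32);
	v->port_bitmap |= UINT32_C(1) << port;

	if (v->port_bitmap == old)
		return VFT_OP_NONE;

	return old ? VFT_OP_UPDATE : VFT_OP_ADD;
}

/* A port leaves the VLAN: the last member leaving deletes the entry. */
static enum vft_op vlan_port_leave(struct vlan_shadow *v, unsigned int port)
{
	uint32_t old = v->port_bitmap;

	assert(port < 32);
	v->port_bitmap &= ~(UINT32_C(1) << port);

	if (v->port_bitmap == old)
		return VFT_OP_NONE;

	return v->port_bitmap ? VFT_OP_UPDATE : VFT_OP_DELETE;
}
```

In the driver these decisions would be taken under vft_lock, with the
resulting command issued through ntmp_vft_add_entry(),
ntmp_vft_update_entry() or ntmp_vft_delete_entry().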
drivers/net/dsa/netc/netc_main.c | 438 +++++++++++++++++++++++++++++
drivers/net/dsa/netc/netc_switch.h | 24 ++
include/linux/fsl/ntmp.h | 15 +
3 files changed, 477 insertions(+)
diff --git a/drivers/net/dsa/netc/netc_main.c b/drivers/net/dsa/netc/netc_main.c
index d4475ad7ed6c..1e1873528ccc 100644
--- a/drivers/net/dsa/netc/netc_main.c
+++ b/drivers/net/dsa/netc/netc_main.c
@@ -37,6 +37,27 @@ static void netc_destroy_fdb_list(struct netc_switch *priv)
netc_del_fdb_entry(entry);
}
+static struct netc_vlan_entry *
+netc_lookup_vlan_entry(struct netc_switch *priv, u16 vid)
+{
+ struct netc_vlan_entry *entry;
+
+ hlist_for_each_entry(entry, &priv->vlan_list, node)
+ if (entry->vid == vid)
+ return entry;
+
+ return NULL;
+}
+
+static void netc_destroy_vlan_list(struct netc_switch *priv)
+{
+ struct netc_vlan_entry *entry;
+ struct hlist_node *tmp;
+
+ hlist_for_each_entry_safe(entry, tmp, &priv->vlan_list, node)
+ netc_del_vlan_entry(entry);
+}
+
static enum dsa_tag_protocol
netc_get_tag_protocol(struct dsa_switch *ds, int port,
enum dsa_tag_protocol mprot)
@@ -222,6 +243,7 @@ static int netc_init_all_ports(struct netc_switch *priv)
struct device *dev = priv->dev;
struct netc_port *np;
struct dsa_port *dp;
+ int ett_offset = 0;
int err;
priv->ports = devm_kcalloc(dev, priv->info->num_ports,
@@ -251,6 +273,8 @@ static int netc_init_all_ports(struct netc_switch *priv)
dsa_switch_for_each_available_port(dp, priv->ds) {
np = priv->ports[dp->index];
np->dp = dp;
+ np->ett_offset = ett_offset++;
+ priv->port_bitmap |= BIT(dp->index);
err = netc_port_get_info_from_dt(np, dp->dn, dev);
if (err)
@@ -831,6 +855,8 @@ static int netc_setup(struct dsa_switch *ds)
INIT_HLIST_HEAD(&priv->fdb_list);
mutex_init(&priv->fdbt_lock);
+ INIT_HLIST_HEAD(&priv->vlan_list);
+ mutex_init(&priv->vft_lock);
netc_switch_fixed_config(priv);
@@ -858,6 +884,7 @@ static int netc_setup(struct dsa_switch *ds)
* hardware state.
*/
mutex_destroy(&priv->fdbt_lock);
+ mutex_destroy(&priv->vft_lock);
netc_free_ntmp_user(priv);
return err;
@@ -867,6 +894,8 @@ static void netc_destroy_all_lists(struct netc_switch *priv)
{
netc_destroy_fdb_list(priv);
mutex_destroy(&priv->fdbt_lock);
+ netc_destroy_vlan_list(priv);
+ mutex_destroy(&priv->vft_lock);
}
static void netc_free_host_flood_rules(struct netc_switch *priv)
@@ -1025,6 +1054,384 @@ static void netc_switch_get_ip_revision(struct netc_switch *priv)
priv->revision = FIELD_GET(IPBRR0_IP_REV, val);
}
+static void netc_init_ett_cfge(struct ett_cfge_data *cfge,
+ bool untagged, u32 ect_eid)
+{
+ u32 vuda_sqta = FMTEID_VUDA_SQTA;
+ u16 efm_cfg = 0;
+
+ if (ect_eid != NTMP_NULL_ENTRY_ID) {
+ /* Increase egress frame counter */
+ efm_cfg |= FIELD_PREP(ETT_ECA, ETT_ECA_INC);
+ cfge->ec_eid = cpu_to_le32(ect_eid);
+ }
+
+ /* If egress rule is VLAN untagged */
+ if (untagged) {
+ /* delete outer VLAN tag */
+ vuda_sqta |= FIELD_PREP(FMTEID_VUDA, FMTEID_VUDA_DEL_OTAG);
+ /* length change: twos-complement notation */
+ efm_cfg |= FIELD_PREP(ETT_EFM_LEN_CHANGE,
+ ETT_FRM_LEN_DEL_VLAN);
+ }
+
+ cfge->efm_eid = cpu_to_le32(vuda_sqta);
+ cfge->efm_cfg = cpu_to_le16(efm_cfg);
+}
+
+static int netc_add_ett_entry(struct netc_switch *priv, bool untagged,
+ u32 ett_eid, u32 ect_eid)
+{
+ struct ntmp_user *ntmp = &priv->ntmp;
+ struct ett_cfge_data cfge = {};
+
+ netc_init_ett_cfge(&cfge, untagged, ect_eid);
+
+ return ntmp_ett_add_entry(ntmp, ett_eid, &cfge);
+}
+
+static int netc_update_ett_entry(struct netc_switch *priv, bool untagged,
+ u32 ett_eid, u32 ect_eid)
+{
+ struct ntmp_user *ntmp = &priv->ntmp;
+ struct ett_cfge_data cfge = {};
+
+ netc_init_ett_cfge(&cfge, untagged, ect_eid);
+
+ return ntmp_ett_update_entry(ntmp, ett_eid, &cfge);
+}
+
+static int netc_add_ett_group_entries(struct netc_switch *priv,
+ u32 untagged_port_bitmap,
+ u32 ett_base_eid,
+ u32 ect_base_eid)
+{
+ struct netc_port **ports = priv->ports;
+ u32 ett_eid, ect_eid;
+ bool untagged;
+ int i, err;
+
+ for (i = 0; i < priv->info->num_ports; i++) {
+ if (!ports[i]->dp)
+ continue;
+
+ untagged = !!(untagged_port_bitmap & BIT(i));
+ ett_eid = ett_base_eid + ports[i]->ett_offset;
+ ect_eid = NTMP_NULL_ENTRY_ID;
+ if (ect_base_eid != NTMP_NULL_ENTRY_ID)
+ ect_eid = ect_base_eid + ports[i]->ett_offset;
+
+ err = netc_add_ett_entry(priv, untagged, ett_eid, ect_eid);
+ if (err)
+ goto clear_ett_entries;
+ }
+
+ return 0;
+
+clear_ett_entries:
+ while (--i >= 0) {
+ if (!ports[i]->dp)
+ continue;
+
+ ett_eid = ett_base_eid + ports[i]->ett_offset;
+ ntmp_ett_delete_entry(&priv->ntmp, ett_eid);
+ }
+
+ return err;
+}
+
+static int netc_add_vlan_egress_rule(struct netc_switch *priv,
+ struct netc_vlan_entry *entry)
+{
+ u32 num_ports = netc_num_available_ports(priv);
+ struct ntmp_user *ntmp = &priv->ntmp;
+ u32 ect_eid = NTMP_NULL_ENTRY_ID;
+ u32 ett_eid, ett_gid, ect_gid;
+ int err;
+
+ /* Step 1: Find available egress counter table entries and update
+ * these entries.
+ */
+ ect_gid = ntmp_lookup_free_eid(ntmp->ect_gid_bitmap,
+ ntmp->ect_bitmap_size);
+ if (ect_gid == NTMP_NULL_ENTRY_ID) {
+ dev_info(priv->dev,
+ "No egress counter table entries available\n");
+ } else {
+ ect_eid = ect_gid * num_ports;
+ for (int i = 0; i < num_ports; i++)
+ /* There is no need to check the return value. On failure
+ * the entry's counter might be inaccurate, but this does
+ * not affect functionality; the counter is only used for
+ * debugging.
+ */
+ ntmp_ect_update_entry(ntmp, ect_eid + i);
+ }
+
+ /* Step 2: Find available egress treatment table entries and add
+ * these entries.
+ */
+ ett_gid = ntmp_lookup_free_eid(ntmp->ett_gid_bitmap,
+ ntmp->ett_bitmap_size);
+ if (ett_gid == NTMP_NULL_ENTRY_ID) {
+ dev_err(priv->dev,
+ "No egress treatment table entries available\n");
+ err = -ENOSPC;
+ goto clear_ect_gid;
+ }
+
+ ett_eid = ett_gid * num_ports;
+ err = netc_add_ett_group_entries(priv, entry->untagged_port_bitmap,
+ ett_eid, ect_eid);
+ if (err)
+ goto clear_ett_gid;
+
+ entry->cfge.et_eid = cpu_to_le32(ett_eid);
+ entry->ect_gid = ect_gid;
+
+ return 0;
+
+clear_ett_gid:
+ ntmp_clear_eid_bitmap(ntmp->ett_gid_bitmap, ett_gid);
+
+clear_ect_gid:
+ if (ect_gid != NTMP_NULL_ENTRY_ID)
+ ntmp_clear_eid_bitmap(ntmp->ect_gid_bitmap, ect_gid);
+
+ return err;
+}
+
+static void netc_delete_vlan_egress_rule(struct netc_switch *priv,
+ struct netc_vlan_entry *entry)
+{
+ u32 num_ports = netc_num_available_ports(priv);
+ struct ntmp_user *ntmp = &priv->ntmp;
+ u32 ett_eid, ett_gid;
+
+ ett_eid = le32_to_cpu(entry->cfge.et_eid);
+ if (ett_eid == NTMP_NULL_ENTRY_ID)
+ return;
+
+ ett_gid = ett_eid / num_ports;
+ ntmp_clear_eid_bitmap(ntmp->ett_gid_bitmap, ett_gid);
+ for (int i = 0; i < num_ports; i++)
+ ntmp_ett_delete_entry(ntmp, ett_eid + i);
+
+ if (entry->ect_gid == NTMP_NULL_ENTRY_ID)
+ return;
+
+ ntmp_clear_eid_bitmap(ntmp->ect_gid_bitmap, entry->ect_gid);
+}
+
+static int netc_port_update_vlan_egress_rule(struct netc_port *np,
+ struct netc_vlan_entry *entry)
+{
+ bool untagged = !!(entry->untagged_port_bitmap & BIT(np->dp->index));
+ u32 num_ports = netc_num_available_ports(np->switch_priv);
+ u32 ett_eid = le32_to_cpu(entry->cfge.et_eid);
+ struct netc_switch *priv = np->switch_priv;
+ u32 ect_eid = NTMP_NULL_ENTRY_ID;
+ int err;
+
+ if (ett_eid == NTMP_NULL_ENTRY_ID)
+ return 0;
+
+ ett_eid += np->ett_offset;
+ err = netc_update_ett_entry(priv, untagged, ett_eid, ect_eid);
+ if (err) {
+ dev_err(priv->dev,
+ "Failed to update VLAN %u egress rule on port %d\n",
+ entry->vid, np->dp->index);
+ return err;
+ }
+
+ if (entry->ect_gid != NTMP_NULL_ENTRY_ID) {
+ /* Each ETT entry maps to an ECT entry if ect_gid is not NULL
+ * entry ID. The offset of the ECT entry corresponding to the
+ * port in the group is equal to ett_offset.
+ */
+ ect_eid = entry->ect_gid * num_ports + np->ett_offset;
+ ntmp_ect_update_entry(&priv->ntmp, ect_eid);
+ }
+
+ return 0;
+}
+
+static int netc_port_add_vlan_entry(struct netc_port *np, u16 vid,
+ bool untagged)
+{
+ struct netc_switch *priv = np->switch_priv;
+ struct netc_vlan_entry *entry;
+ struct vft_cfge_data *cfge;
+ u32 index = np->dp->index;
+ u32 bitmap_stg;
+ int err;
+ u16 cfg;
+
+ entry = kzalloc_obj(*entry);
+ if (!entry)
+ return -ENOMEM;
+
+ entry->vid = vid;
+ entry->ect_gid = NTMP_NULL_ENTRY_ID;
+
+ bitmap_stg = BIT(index) | VFT_STG_ID(0);
+ cfg = FIELD_PREP(VFT_MLO, MLO_HW) |
+ FIELD_PREP(VFT_MFO, MFO_NO_MATCH_FLOOD);
+
+ cfge = &entry->cfge;
+ cfge->et_eid = cpu_to_le32(NTMP_NULL_ENTRY_ID);
+ cfge->bitmap_stg = cpu_to_le32(bitmap_stg);
+ cfge->fid = cpu_to_le16(vid);
+ cfge->cfg = cpu_to_le16(cfg);
+ cfge->eta_port_bitmap = cpu_to_le32(priv->port_bitmap);
+
+ if (untagged)
+ entry->untagged_port_bitmap = BIT(index);
+
+ err = netc_add_vlan_egress_rule(priv, entry);
+ if (err)
+ goto free_vlan_entry;
+
+ err = ntmp_vft_add_entry(&priv->ntmp, vid, cfge);
+ if (err) {
+ dev_err(priv->dev,
+ "Failed to add VLAN %u entry on port %d\n",
+ vid, index);
+ goto delete_vlan_egress_rule;
+ }
+
+ netc_add_vlan_entry(priv, entry);
+
+ return 0;
+
+delete_vlan_egress_rule:
+ netc_delete_vlan_egress_rule(priv, entry);
+free_vlan_entry:
+ kfree(entry);
+
+ return err;
+}
+
+static bool netc_port_vlan_egress_rule_changed(struct netc_vlan_entry *entry,
+ int port, bool untagged)
+{
+ bool old_untagged = !!(entry->untagged_port_bitmap & BIT(port));
+
+ return old_untagged != untagged;
+}
+
+static int netc_port_set_vlan_entry(struct netc_port *np, u16 vid,
+ bool untagged)
+{
+ struct netc_switch *priv = np->switch_priv;
+ struct netc_vlan_entry *entry;
+ struct vft_cfge_data *cfge;
+ int port = np->dp->index;
+ bool changed;
+ int err = 0;
+
+ mutex_lock(&priv->vft_lock);
+
+ entry = netc_lookup_vlan_entry(priv, vid);
+ if (!entry) {
+ err = netc_port_add_vlan_entry(np, vid, untagged);
+ goto unlock_vft;
+ }
+
+ /* Check whether the egress VLAN rule is changed */
+ changed = netc_port_vlan_egress_rule_changed(entry, port, untagged);
+ if (changed) {
+ entry->untagged_port_bitmap ^= BIT(port);
+ err = netc_port_update_vlan_egress_rule(np, entry);
+ if (err) {
+ entry->untagged_port_bitmap ^= BIT(port);
+ goto unlock_vft;
+ }
+ }
+
+ cfge = &entry->cfge;
+ if (cfge->bitmap_stg & cpu_to_le32(BIT(port)))
+ goto unlock_vft;
+
+ cfge->bitmap_stg |= cpu_to_le32(BIT(port));
+ err = ntmp_vft_update_entry(&priv->ntmp, vid, cfge);
+ if (err) {
+ dev_err(priv->dev,
+ "Failed to update VLAN %u entry on port %d\n",
+ vid, port);
+
+ goto restore_bitmap_stg;
+ }
+
+ mutex_unlock(&priv->vft_lock);
+
+ return 0;
+
+restore_bitmap_stg:
+ cfge->bitmap_stg &= cpu_to_le32(~BIT(port));
+ if (changed) {
+ entry->untagged_port_bitmap ^= BIT(port);
+ /* Recover the corresponding ETT entry. It doesn't matter
+ * if it fails because the bit corresponding to the port
+ * in the port bitmap of the VFT entry is not set, so the
+ * frame will not match that ETT entry.
+ */
+ if (netc_port_update_vlan_egress_rule(np, entry))
+ entry->untagged_port_bitmap ^= BIT(port);
+ }
+unlock_vft:
+ mutex_unlock(&priv->vft_lock);
+
+ return err;
+}
+
+static int netc_port_del_vlan_entry(struct netc_port *np, u16 vid)
+{
+ struct netc_switch *priv = np->switch_priv;
+ struct netc_vlan_entry *entry;
+ struct vft_cfge_data *cfge;
+ int port = np->dp->index;
+ u32 vlan_port_bitmap;
+ int err = 0;
+
+ mutex_lock(&priv->vft_lock);
+
+ entry = netc_lookup_vlan_entry(priv, vid);
+ if (!entry)
+ goto unlock_vft;
+
+ cfge = &entry->cfge;
+ vlan_port_bitmap = FIELD_GET(VFT_PORT_MEMBERSHIP,
+ le32_to_cpu(cfge->bitmap_stg));
+ /* If the VLAN only belongs to the current port */
+ if (vlan_port_bitmap == BIT(port)) {
+ err = ntmp_vft_delete_entry(&priv->ntmp, vid);
+ if (err)
+ goto unlock_vft;
+
+ netc_delete_vlan_egress_rule(priv, entry);
+ netc_del_vlan_entry(entry);
+
+ goto unlock_vft;
+ }
+
+ if (!(vlan_port_bitmap & BIT(port)))
+ goto unlock_vft;
+
+ cfge->bitmap_stg &= cpu_to_le32(~BIT(port));
+ err = ntmp_vft_update_entry(&priv->ntmp, vid, cfge);
+ if (err) {
+ cfge->bitmap_stg |= cpu_to_le32(BIT(port));
+ goto unlock_vft;
+ }
+
+unlock_vft:
+ mutex_unlock(&priv->vft_lock);
+
+ return err;
+}
+
static int netc_port_enable(struct dsa_switch *ds, int port,
struct phy_device *phy)
{
@@ -1297,6 +1704,35 @@ static void netc_port_set_host_flood(struct dsa_switch *ds, int port,
netc_port_remove_host_flood(np, old_host_flood);
}
+static int netc_port_vlan_add(struct dsa_switch *ds, int port,
+ const struct switchdev_obj_port_vlan *vlan,
+ struct netlink_ext_ack *extack)
+{
+ struct netc_port *np = NETC_PORT(ds, port);
+ bool untagged;
+
+ /* The 8021q layer may attempt to change NETC_STANDALONE_PVID
+ * (VID 0), so we need to ignore it.
+ */
+ if (vlan->vid == NETC_STANDALONE_PVID)
+ return 0;
+
+ untagged = !!(vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED);
+
+ return netc_port_set_vlan_entry(np, vlan->vid, untagged);
+}
+
+static int netc_port_vlan_del(struct dsa_switch *ds, int port,
+ const struct switchdev_obj_port_vlan *vlan)
+{
+ struct netc_port *np = NETC_PORT(ds, port);
+
+ if (vlan->vid == NETC_STANDALONE_PVID)
+ return 0;
+
+ return netc_port_del_vlan_entry(np, vlan->vid);
+}
+
static void netc_phylink_get_caps(struct dsa_switch *ds, int port,
struct phylink_config *config)
{
@@ -1575,6 +2011,8 @@ static const struct dsa_switch_ops netc_switch_ops = {
.port_mdb_add = netc_port_mdb_add,
.port_mdb_del = netc_port_mdb_del,
.port_set_host_flood = netc_port_set_host_flood,
+ .port_vlan_add = netc_port_vlan_add,
+ .port_vlan_del = netc_port_vlan_del,
.get_pause_stats = netc_port_get_pause_stats,
.get_rmon_stats = netc_port_get_rmon_stats,
.get_eth_ctrl_stats = netc_port_get_eth_ctrl_stats,
diff --git a/drivers/net/dsa/netc/netc_switch.h b/drivers/net/dsa/netc/netc_switch.h
index 4fbd12825b67..9ff334301fbc 100644
--- a/drivers/net/dsa/netc/netc_switch.h
+++ b/drivers/net/dsa/netc/netc_switch.h
@@ -74,6 +74,7 @@ struct netc_port {
struct dsa_port *dp;
struct clk *ref_clk; /* RGMII/RMII reference clock */
struct mii_bus *emdio;
+ int ett_offset;
u16 enable:1;
u16 uc:1;
@@ -94,6 +95,14 @@ struct netc_fdb_entry {
struct hlist_node node;
};
+struct netc_vlan_entry {
+ u16 vid;
+ u32 ect_gid;
+ u32 untagged_port_bitmap;
+ struct vft_cfge_data cfge;
+ struct hlist_node node;
+};
+
struct netc_port_stat {
int reg;
char name[ETH_GSTRING_LEN] __nonstring;
@@ -108,10 +117,13 @@ struct netc_switch {
const struct netc_switch_info *info;
struct netc_switch_regs regs;
struct netc_port **ports;
+ u32 port_bitmap; /* bitmap of available ports */
struct ntmp_user ntmp;
struct hlist_head fdb_list;
struct mutex fdbt_lock; /* FDB table lock */
+ struct hlist_head vlan_list;
+ struct mutex vft_lock; /* VLAN filter table lock */
/* Switch hardware capabilities */
u32 htmcapr_num_words;
@@ -153,6 +165,18 @@ static inline void netc_del_fdb_entry(struct netc_fdb_entry *entry)
kfree(entry);
}
+static inline void netc_add_vlan_entry(struct netc_switch *priv,
+ struct netc_vlan_entry *entry)
+{
+ hlist_add_head(&entry->node, &priv->vlan_list);
+}
+
+static inline void netc_del_vlan_entry(struct netc_vlan_entry *entry)
+{
+ hlist_del(&entry->node);
+ kfree(entry);
+}
+
int netc_switch_platform_probe(struct netc_switch *priv);
/* ethtool APIs */
diff --git a/include/linux/fsl/ntmp.h b/include/linux/fsl/ntmp.h
index 4d329488763d..d3b6c476b91a 100644
--- a/include/linux/fsl/ntmp.h
+++ b/include/linux/fsl/ntmp.h
@@ -262,6 +262,21 @@ struct bpt_cfge_data {
__le32 fc_ports;
};
+union ntmp_fmt_eid {
+ __le32 index;
+#define FMTEID_INDEX GENMASK(12, 0)
+ __le32 vuda_sqta;
+#define FMTEID_VUDA GENMASK(1, 0)
+#define FMTEID_VUDA_DEL_OTAG 2
+#define FMTEID_SQTA GENMASK(4, 2)
+#define FMTEID_SQTA_DEL 2
+#define FMTEID_VUDA_SQTA BIT(13)
+ __le32 vara_vid;
+#define FMTEID_VID GENMASK(11, 0)
+#define FMTEID_VARA GENMASK(13, 12)
+#define FMTEID_VARA_VID BIT(14)
+};
+
#if IS_ENABLED(CONFIG_NXP_NETC_LIB)
int ntmp_init_cbdr(struct netc_cbdr *cbdr, struct device *dev,
const struct netc_cbdr_regs *regs);
--
2.34.1
^ permalink raw reply related
* [PATCH v3 net-next 8/9] net: dsa: netc: add bridge mode support
From: wei.fang @ 2026-06-05 1:48 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
Wire up the port_bridge_join, port_bridge_leave and port_vlan_filtering
DSA callbacks to support both VLAN-unaware and VLAN-aware bridge modes.
For VLAN-unaware bridges, each bridge instance is assigned a dedicated
internal PVID via NETC_VLAN_UNAWARE_PVID(bridge.num), counting down
from VID 4095. A VFT entry is created for this PVID with hardware MAC
learning and flood-on-miss forwarding enabled. The CPU port is included
as a VFT member so frames can reach the host. The reserved VID range is
blocked in port_vlan_add to prevent user-space conflicts.
Only one VLAN-aware bridge is supported at a time; this constraint is
enforced in port_bridge_join and port_vlan_filtering. The per-port PVID
is tracked in software and written to the BPDVR register whenever VLAN
filtering is active.
When a port leaves the bridge, its dynamic FDB entries are flushed right
away in port_bridge_leave(), without waiting for the ageing cycle. When
a link down event occurs on a port, netc_mac_link_down() will also clear
the port's dynamic FDB entries via netc_port_remove_dynamic_entries().
Non-bridge ports have no dynamic FDB entries, so this call is always
safe. Additionally, the .port_fast_age() callback is added to flush the
dynamic FDB entries associated with a port.
Host flood rules are removed from the ingress port filter table when a
port joins a bridge to avoid bypassing FDB lookup and MAC learning.
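As a rough illustration of the reserved VID range described above, the
following standalone sketch reuses the NETC_VLAN_UNAWARE_PVID() macro from
this patch. The helper sketch_vid_is_reserved() is a hypothetical name, not
a driver function, and the value 4 used for max_num_bridges in the note
below is only an assumed example (a switch with five ports).

```c
#include <stdbool.h>
#include <stdint.h>

/* Same definition as in netc_switch.h in this patch. */
#define NETC_VLAN_UNAWARE_PVID(br_id)	(4096 - (br_id))

/* True if 'vid' falls in the range reserved for VLAN-unaware bridges,
 * i.e. the comparison port_vlan_add performs before rejecting a VID.
 */
static bool sketch_vid_is_reserved(uint16_t vid, unsigned int max_num_bridges)
{
	return vid >= NETC_VLAN_UNAWARE_PVID(max_num_bridges);
}
```

Bridge number 1 maps to VID 4095 and bridge number 2 to VID 4094. With
max_num_bridges assumed to be 4, VIDs 4092 to 4095 are reserved and VID 4091
is still available to user space.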
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/dsa/netc/netc_main.c | 353 +++++++++++++++++++++++++++--
drivers/net/dsa/netc/netc_switch.h | 2 +
2 files changed, 339 insertions(+), 16 deletions(-)
diff --git a/drivers/net/dsa/netc/netc_main.c b/drivers/net/dsa/netc/netc_main.c
index 1e1873528ccc..b61152c16e7b 100644
--- a/drivers/net/dsa/netc/netc_main.c
+++ b/drivers/net/dsa/netc/netc_main.c
@@ -509,6 +509,17 @@ static void netc_port_set_mlo(struct netc_port *np, enum netc_mlo mlo)
netc_port_rmw(np, NETC_BPCR, BPCR_MLO, FIELD_PREP(BPCR_MLO, mlo));
}
+static void netc_port_set_pvid(struct netc_port *np, u16 pvid)
+{
+ netc_port_rmw(np, NETC_BPDVR, BPDVR_VID, pvid);
+}
+
+static void netc_port_set_vlan_aware(struct netc_port *np, bool aware)
+{
+ netc_port_rmw(np, NETC_BPDVR, BPDVR_RXVAM,
+ aware ? 0 : BPDVR_RXVAM);
+}
+
static void netc_port_fixed_config(struct netc_port *np)
{
/* Default IPV and DR setting */
@@ -534,7 +545,7 @@ static void netc_port_default_config(struct netc_port *np)
netc_port_fixed_config(np);
/* Default VLAN unaware */
- netc_port_rmw(np, NETC_BPDVR, BPDVR_RXVAM, BPDVR_RXVAM);
+ netc_port_set_vlan_aware(np, false);
if (dsa_port_is_cpu(np->dp))
/* For CPU port, source port pruning is disabled */
@@ -695,10 +706,16 @@ static int netc_port_del_fdb_entry(struct netc_port *np,
entry = netc_lookup_fdb_entry(priv, addr, vid);
if (unlikely(!entry))
- /* Currently only single port mode is supported, MAC learning
- * is disabled, so there is no dynamically learned FDB entry.
- * We need to support deleting dynamically FDB entry when the
- * bridge mode is supported.
+ /* The hardware-learned dynamic FDB entries cannot be deleted
+ * through .port_fdb_del() interface.
+ * For NTF_MASTER path: hardware-learned dynamic FDB entries
+ * are never synchronized back to the bridge software database,
+ * so br_fdb_delete() -> br_fdb_find() cannot find the FDB
+ * entry and .port_fdb_del() will not be called.
+ * For NTF_SELF path: dsa_user_netdev_ops does not implement
+ * ndo_fdb_del(), so rtnl_fdb_del() falls back to
+ * ndo_dflt_fdb_del(), which only supports NUD_PERMANENT static
+ * entries and rejects all others with -EINVAL.
*/
goto unlock_fdbt;
@@ -1276,6 +1293,16 @@ static int netc_port_add_vlan_entry(struct netc_port *np, u16 vid,
entry->ect_gid = NTMP_NULL_ENTRY_ID;
bitmap_stg = BIT(index) | VFT_STG_ID(0);
+ /* If the VID is a VLAN-unaware PVID, the CPU port needs to be
+ * a member of this VLAN.
+ */
+ if (dsa_port_is_user(np->dp) &&
+ vid >= NETC_VLAN_UNAWARE_PVID(priv->ds->max_num_bridges)) {
+ struct dsa_port *cpu_dp = np->dp->cpu_dp;
+
+ bitmap_stg |= BIT(cpu_dp->index);
+ }
+
cfg = FIELD_PREP(VFT_MLO, MLO_HW) |
FIELD_PREP(VFT_MFO, MFO_NO_MATCH_FLOOD);
@@ -1313,11 +1340,16 @@ static int netc_port_add_vlan_entry(struct netc_port *np, u16 vid,
return err;
}
-static bool netc_port_vlan_egress_rule_changed(struct netc_vlan_entry *entry,
+static bool netc_port_vlan_egress_rule_changed(struct netc_switch *priv,
+ struct netc_vlan_entry *entry,
int port, bool untagged)
{
bool old_untagged = !!(entry->untagged_port_bitmap & BIT(port));
+ /* VLAN-unaware VIDs have no egress rules, so return 'false' */
+ if (entry->vid >= NETC_VLAN_UNAWARE_PVID(priv->ds->max_num_bridges))
+ return false;
+
return old_untagged != untagged;
}
@@ -1340,7 +1372,8 @@ static int netc_port_set_vlan_entry(struct netc_port *np, u16 vid,
}
/* Check whether the egress VLAN rule is changed */
- changed = netc_port_vlan_egress_rule_changed(entry, port, untagged);
+ changed = netc_port_vlan_egress_rule_changed(priv, entry, port,
+ untagged);
if (changed) {
entry->untagged_port_bitmap ^= BIT(port);
err = netc_port_update_vlan_egress_rule(np, entry);
@@ -1404,6 +1437,17 @@ static int netc_port_del_vlan_entry(struct netc_port *np, u16 vid)
cfge = &entry->cfge;
vlan_port_bitmap = FIELD_GET(VFT_PORT_MEMBERSHIP,
le32_to_cpu(cfge->bitmap_stg));
+ /* If the VID is a VLAN-unaware PVID, we need to clear the CPU
+ * port bit of vlan_port_bitmap, so that the VLAN entry can be
+ * deleted if no user ports use this VLAN.
+ */
+ if (dsa_port_is_user(np->dp) &&
+ vid >= NETC_VLAN_UNAWARE_PVID(priv->ds->max_num_bridges)) {
+ struct dsa_port *cpu_dp = np->dp->cpu_dp;
+
+ vlan_port_bitmap &= ~BIT(cpu_dp->index);
+ }
+
/* If the VLAN only belongs to the current port */
if (vlan_port_bitmap == BIT(port)) {
err = ntmp_vft_delete_entry(&priv->ntmp, vid);
@@ -1509,17 +1553,50 @@ static int netc_port_max_mtu(struct dsa_switch *ds, int port)
return NETC_MAX_FRAME_LEN - VLAN_ETH_HLEN - ETH_FCS_LEN;
}
+static struct net_device *netc_classify_db(struct dsa_db db)
+{
+ switch (db.type) {
+ case DSA_DB_PORT:
+ return NULL;
+ case DSA_DB_BRIDGE:
+ return db.bridge.dev;
+ default:
+ return ERR_PTR(-EOPNOTSUPP);
+ }
+}
+
+static u16 netc_vlan_unaware_pvid(struct dsa_port *dp)
+{
+ u32 br_num;
+
+ if (!dsa_port_bridge_dev_get(dp))
+ return NETC_STANDALONE_PVID;
+
+ br_num = dsa_port_bridge_num_get(dp);
+
+ /* The br_num is supposed to be 1 ~ ds->max_num_bridges, see
+ * dsa_bridge_num_get(). Since max_num_bridges is non-zero,
+ * dsa_port_bridge_create() will return an error if
+ * dsa_bridge_num_get() returns 0.
+ */
+ if (WARN_ON(!br_num))
+ return NETC_STANDALONE_PVID;
+
+ return NETC_VLAN_UNAWARE_PVID(br_num);
+}
+
static int netc_port_fdb_add(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid,
struct dsa_db db)
{
+ struct net_device *br_ndev = netc_classify_db(db);
struct netc_port *np = NETC_PORT(ds, port);
- /* Currently, only support standalone port mode, so only
- * NETC_STANDALONE_PVID (= 0) is supported here.
- */
- if (vid != NETC_STANDALONE_PVID)
- return -EOPNOTSUPP;
+ if (IS_ERR(br_ndev))
+ return PTR_ERR(br_ndev);
+
+ if (!vid)
+ vid = netc_vlan_unaware_pvid(np->dp);
return netc_port_set_fdb_entry(np, addr, vid);
}
@@ -1528,10 +1605,14 @@ static int netc_port_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid,
struct dsa_db db)
{
+ struct net_device *br_ndev = netc_classify_db(db);
struct netc_port *np = NETC_PORT(ds, port);
- if (vid != NETC_STANDALONE_PVID)
- return -EOPNOTSUPP;
+ if (IS_ERR(br_ndev))
+ return PTR_ERR(br_ndev);
+
+ if (!vid)
+ vid = netc_vlan_unaware_pvid(np->dp);
return netc_port_del_fdb_entry(np, addr, vid);
}
@@ -1567,6 +1648,8 @@ static int netc_port_fdb_dump(struct dsa_switch *ds, int port,
cfg = le32_to_cpu(cfge->cfg);
is_static = (cfg & FDBT_DYNAMIC) ? false : true;
vid = le16_to_cpu(keye->fid);
+ if (vid >= NETC_VLAN_UNAWARE_PVID(ds->max_num_bridges))
+ vid = 0;
err = cb(keye->mac_addr, vid, is_static, data);
if (err)
@@ -1669,12 +1752,23 @@ static void netc_port_remove_host_flood(struct netc_port *np,
struct ipft_entry_data *host_flood)
{
struct netc_switch *priv = np->switch_priv;
+ bool disable_host_flood = false;
if (!host_flood)
return;
+ if (np->host_flood == host_flood)
+ disable_host_flood = true;
+
ntmp_ipft_delete_entry(&priv->ntmp, host_flood->entry_id);
kfree(host_flood);
+
+ if (disable_host_flood) {
+ np->host_flood = NULL;
+ np->uc = false;
+ np->mc = false;
+ netc_port_wr(np, NETC_PIPFCR, 0);
+ }
}
static void netc_port_set_host_flood(struct dsa_switch *ds, int port,
@@ -1683,6 +1777,17 @@ static void netc_port_set_host_flood(struct dsa_switch *ds, int port,
struct netc_port *np = NETC_PORT(ds, port);
struct ipft_entry_data *old_host_flood;
+ /* Do not add host flood rule to ingress port filter table when
+ * the port has joined a bridge. Otherwise, the ingress frames
+ * will bypass FDB table lookup and MAC learning, so the frames
+ * will be redirected directly to the CPU port.
+ */
+ if (dsa_port_bridge_dev_get(np->dp)) {
+ netc_port_remove_host_flood(np, np->host_flood);
+
+ return;
+ }
+
if (np->uc == uc && np->mc == mc)
return;
@@ -1704,12 +1809,85 @@ static void netc_port_set_host_flood(struct dsa_switch *ds, int port,
netc_port_remove_host_flood(np, old_host_flood);
}
+static int netc_single_vlan_aware_bridge(struct dsa_switch *ds,
+ struct netlink_ext_ack *extack)
+{
+ struct net_device *br_ndev = NULL;
+ struct dsa_port *dp;
+
+ dsa_switch_for_each_available_port(dp, ds) {
+ struct net_device *port_br = dsa_port_bridge_dev_get(dp);
+
+ if (!port_br || !br_vlan_enabled(port_br))
+ continue;
+
+ if (!br_ndev) {
+ br_ndev = port_br;
+ continue;
+ }
+
+ if (br_ndev == port_br)
+ continue;
+
+ NL_SET_ERR_MSG_MOD(extack,
+ "Only one VLAN-aware bridge is supported");
+
+ return -EBUSY;
+ }
+
+ return 0;
+}
+
+static int netc_port_vlan_filtering(struct dsa_switch *ds,
+ int port, bool vlan_aware,
+ struct netlink_ext_ack *extack)
+{
+ struct netc_port *np = NETC_PORT(ds, port);
+ u16 pvid;
+ int err;
+
+ err = netc_single_vlan_aware_bridge(ds, extack);
+ if (err)
+ return err;
+
+ pvid = netc_vlan_unaware_pvid(np->dp);
+ if (pvid == NETC_STANDALONE_PVID) {
+ vlan_aware = false;
+ goto bpdvr_config;
+ }
+
+ if (vlan_aware) {
+ /* The FDB entries associated with unaware_pvid do not need
+ * to be deleted, so that when switching from VLAN-aware to
+ * VLAN-unaware mode, these FDB entries do not need to be
+ * re-added.
+ */
+ err = netc_port_del_vlan_entry(np, pvid);
+ if (err)
+ return err;
+
+ pvid = np->pvid;
+ } else {
+ err = netc_port_set_vlan_entry(np, pvid, false);
+ if (err)
+ return err;
+ }
+
+bpdvr_config:
+ netc_port_set_vlan_aware(np, vlan_aware);
+ netc_port_set_pvid(np, pvid);
+
+ return 0;
+}
+
static int netc_port_vlan_add(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan,
struct netlink_ext_ack *extack)
{
struct netc_port *np = NETC_PORT(ds, port);
+ struct dsa_port *dp = np->dp;
bool untagged;
+ int err;
/* The 8021q layer may attempt to change NETC_STANDALONE_PVID
* (VID 0), so we need to ignore it.
@@ -1717,20 +1895,157 @@ static int netc_port_vlan_add(struct dsa_switch *ds, int port,
if (vlan->vid == NETC_STANDALONE_PVID)
return 0;
+ if (vlan->vid >= NETC_VLAN_UNAWARE_PVID(ds->max_num_bridges)) {
+ NL_SET_ERR_MSG_FMT_MOD(extack,
+ "VID %d~4095 reserved for VLAN-unaware bridge",
+ NETC_VLAN_UNAWARE_PVID(ds->max_num_bridges));
+ return -EINVAL;
+ }
+
untagged = !!(vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED);
+ err = netc_port_set_vlan_entry(np, vlan->vid, untagged);
+ if (err)
+ return err;
+
+ if (vlan->flags & BRIDGE_VLAN_INFO_PVID) {
+ np->pvid = vlan->vid;
+ if (dsa_port_is_vlan_filtering(dp))
+ netc_port_set_pvid(np, vlan->vid);
+
+ return 0;
+ }
- return netc_port_set_vlan_entry(np, vlan->vid, untagged);
+ if (np->pvid != vlan->vid)
+ return 0;
+
+ /* Delete PVID */
+ np->pvid = NETC_STANDALONE_PVID;
+ if (dsa_port_is_vlan_filtering(dp))
+ netc_port_set_pvid(np, NETC_STANDALONE_PVID);
+
+ return 0;
}
static int netc_port_vlan_del(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan)
{
struct netc_port *np = NETC_PORT(ds, port);
+ int err;
if (vlan->vid == NETC_STANDALONE_PVID)
return 0;
- return netc_port_del_vlan_entry(np, vlan->vid);
+ err = netc_port_del_vlan_entry(np, vlan->vid);
+ if (err)
+ return err;
+
+ if (np->pvid == vlan->vid) {
+ np->pvid = NETC_STANDALONE_PVID;
+
+ /* Set the port PVID to NETC_STANDALONE_PVID if the VLAN-aware
+ * bridge port has no PVID. The untagged frames will not be
+ * forwarded to other user ports, as NETC_STANDALONE_PVID VLAN
+ * entry has disabled MAC learning and flooding, and other user
+ * ports do not have FDB entries with NETC_STANDALONE_PVID.
+ */
+ if (dsa_port_is_vlan_filtering(np->dp))
+ netc_port_set_pvid(np, NETC_STANDALONE_PVID);
+ }
+
+ return 0;
+}
+
+static int netc_port_bridge_join(struct dsa_switch *ds, int port,
+ struct dsa_bridge bridge,
+ bool *tx_fwd_offload,
+ struct netlink_ext_ack *extack)
+{
+ struct netc_port *np = NETC_PORT(ds, port);
+ u16 vlan_unaware_pvid;
+ int err;
+
+ if (!bridge.num) {
+ NL_SET_ERR_MSG_MOD(extack, "Bridge number 0 is unsupported");
+ return -EINVAL;
+ }
+
+ err = netc_single_vlan_aware_bridge(ds, extack);
+ if (err)
+ return err;
+
+ netc_port_set_mlo(np, MLO_NOT_OVERRIDE);
+
+ if (br_vlan_enabled(bridge.dev))
+ goto out;
+
+ vlan_unaware_pvid = NETC_VLAN_UNAWARE_PVID(bridge.num);
+ err = netc_port_set_vlan_entry(np, vlan_unaware_pvid, false);
+ if (err)
+ goto disable_mlo;
+
+ netc_port_set_pvid(np, vlan_unaware_pvid);
+
+out:
+ netc_port_remove_host_flood(np, np->host_flood);
+
+ return 0;
+
+disable_mlo:
+ netc_port_set_mlo(np, MLO_DISABLE);
+
+ return err;
+}
+
+static void netc_port_remove_dynamic_entries(struct netc_port *np)
+{
+ struct netc_switch *priv = np->switch_priv;
+
+ /* Return if the port is not available */
+ if (!np->dp)
+ return;
+
+ mutex_lock(&priv->fdbt_lock);
+ ntmp_fdbt_delete_port_dynamic_entries(&priv->ntmp, np->dp->index);
+ mutex_unlock(&priv->fdbt_lock);
+}
+
+static void netc_port_bridge_leave(struct dsa_switch *ds, int port,
+ struct dsa_bridge bridge)
+{
+ struct netc_port *np = NETC_PORT(ds, port);
+ struct net_device *ndev = np->dp->user;
+ u16 vlan_unaware_pvid;
+ bool mc, uc;
+
+ netc_port_set_mlo(np, MLO_DISABLE);
+ netc_port_set_pvid(np, NETC_STANDALONE_PVID);
+ np->pvid = NETC_STANDALONE_PVID;
+
+ netc_port_remove_dynamic_entries(np);
+ uc = ndev->flags & IFF_PROMISC;
+ mc = ndev->flags & (IFF_PROMISC | IFF_ALLMULTI);
+
+ if (netc_port_add_host_flood_rule(np, uc, mc))
+ dev_warn(ds->dev,
+ "Failed to restore host flood rule on port %d\n",
+ port);
+
+ if (br_vlan_enabled(bridge.dev))
+ return;
+
+ vlan_unaware_pvid = NETC_VLAN_UNAWARE_PVID(bridge.num);
+ /* There is no need to check the return value even if it fails.
+ * Because the PVID has been set to NETC_STANDALONE_PVID, the
+ * frames will not match this VLAN entry.
+ */
+ netc_port_del_vlan_entry(np, vlan_unaware_pvid);
+}
+
+static void netc_port_fast_age(struct dsa_switch *ds, int port)
+{
+ struct netc_port *np = NETC_PORT(ds, port);
+
+ netc_port_remove_dynamic_entries(np);
}
static void netc_phylink_get_caps(struct dsa_switch *ds, int port,
@@ -1987,6 +2302,7 @@ static void netc_mac_link_down(struct phylink_config *config,
np = NETC_PORT(dp->ds, dp->index);
netc_port_mac_rx_graceful_stop(np);
netc_port_mac_tx_graceful_stop(np);
+ netc_port_remove_dynamic_entries(np);
}
static const struct phylink_mac_ops netc_phylink_mac_ops = {
@@ -2011,8 +2327,12 @@ static const struct dsa_switch_ops netc_switch_ops = {
.port_mdb_add = netc_port_mdb_add,
.port_mdb_del = netc_port_mdb_del,
.port_set_host_flood = netc_port_set_host_flood,
+ .port_vlan_filtering = netc_port_vlan_filtering,
.port_vlan_add = netc_port_vlan_add,
.port_vlan_del = netc_port_vlan_del,
+ .port_bridge_join = netc_port_bridge_join,
+ .port_bridge_leave = netc_port_bridge_leave,
+ .port_fast_age = netc_port_fast_age,
.get_pause_stats = netc_port_get_pause_stats,
.get_rmon_stats = netc_port_get_rmon_stats,
.get_eth_ctrl_stats = netc_port_get_eth_ctrl_stats,
@@ -2060,6 +2380,7 @@ static int netc_switch_probe(struct pci_dev *pdev,
ds->ops = &netc_switch_ops;
ds->phylink_mac_ops = &netc_phylink_mac_ops;
ds->fdb_isolation = true;
+ ds->max_num_bridges = priv->info->num_ports - 1;
ds->priv = priv;
priv->ds = ds;
diff --git a/drivers/net/dsa/netc/netc_switch.h b/drivers/net/dsa/netc/netc_switch.h
index 9ff334301fbc..982c8d3a3fbf 100644
--- a/drivers/net/dsa/netc/netc_switch.h
+++ b/drivers/net/dsa/netc/netc_switch.h
@@ -33,6 +33,7 @@
#define NETC_MAX_FRAME_LEN 9600
#define NETC_STANDALONE_PVID 0
+#define NETC_VLAN_UNAWARE_PVID(br_id) (4096 - (br_id))
/* Threshold format: MANT (bits 11:4) * 2^EXP (bits 3:0)
* Unit: Memory words (average of 20 bytes each)
@@ -79,6 +80,7 @@ struct netc_port {
u16 enable:1;
u16 uc:1;
u16 mc:1;
+ u16 pvid;
struct ipft_entry_data *host_flood;
};
--
2.34.1
^ permalink raw reply related
* [PATCH v3 net-next 9/9] net: dsa: netc: implement dynamic FDB entry ageing
From: wei.fang @ 2026-06-05 1:48 UTC (permalink / raw)
To: claudiu.manoil, vladimir.oltean, xiaoning.wang, andrew+netdev,
davem, edumazet, kuba, pabeni, chleroy, andrew, olteanv, linux
Cc: wei.fang, imx, netdev, linux-kernel, linuxppc-dev,
linux-arm-kernel
In-Reply-To: <20260605014808.686024-1-wei.fang@oss.nxp.com>
From: Wei Fang <wei.fang@nxp.com>
The NETC switch does not age out dynamic FDB entries automatically.
Without software management, stale entries persist after topology
changes and cause incorrect forwarding.
Add a delayed work that periodically removes entries that have not been
refreshed within the specified cycles. The effective ageing time is:
ageing_time = fdbt_ageing_delay * 100
Default values are 3s interval and 100 cycles (300s total), matching
the IEEE 802.1Q default ageing time. The work starts when the first
port joins a bridge (tracked via br_cnt) and is cancelled when the
last port leaves. All FDB operations are serialized under fdbt_lock.
Implement .set_ageing_time() to allow the bridge layer to reconfigure
ageing parameters on demand.
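The ageing arithmetic described above can be illustrated with a small user-space sketch. It is not the driver code: EXAMPLE_HZ is an assumed tick rate (the real HZ depends on the kernel configuration), and the helper names are hypothetical. Only the threshold of 100 cycles and the round-to-nearest division come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration. The patch defines the threshold as 100;
 * the tick rate is configuration dependent and 250 is only an example. */
#define EXAMPLE_HZ		250u
#define EXAMPLE_AGEING_THRESH	100u

/* Round-to-nearest unsigned division, similar in spirit to the kernel's
 * DIV_ROUND_CLOSEST_ULL(). */
static uint64_t div_round_closest(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d / 2) / d;
}

/* Hypothetical helper: per-tick delay in jiffies for a requested ageing
 * time in milliseconds, using a single rounding step. */
static uint64_t ageing_delay_jiffies(uint32_t msecs)
{
	return div_round_closest((uint64_t)msecs * EXAMPLE_HZ,
				 1000ull * EXAMPLE_AGEING_THRESH);
}

/* Hypothetical helper: ageing time in milliseconds that actually results
 * from a given per-tick delay. */
static uint64_t effective_ageing_ms(uint64_t delay_jiffies)
{
	return delay_jiffies * EXAMPLE_AGEING_THRESH * 1000ull / EXAMPLE_HZ;
}
```

With these assumed values, the 300 s default maps to 750 jiffies per tick with no rounding error, while a 1000 ms request rounds 2.5 jiffies up to 3 and yields an effective ageing time of 1200 ms.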
Signed-off-by: Wei Fang <wei.fang@nxp.com>
---
drivers/net/dsa/netc/netc_main.c | 67 ++++++++++++++++++++++++++++++
drivers/net/dsa/netc/netc_switch.h | 7 ++++
2 files changed, 74 insertions(+)
diff --git a/drivers/net/dsa/netc/netc_main.c b/drivers/net/dsa/netc/netc_main.c
index b61152c16e7b..a8e036a7046c 100644
--- a/drivers/net/dsa/netc/netc_main.c
+++ b/drivers/net/dsa/netc/netc_main.c
@@ -447,6 +447,25 @@ static void netc_free_ntmp_user(struct netc_switch *priv)
netc_free_ntmp_bitmaps(priv);
}
+static void netc_clean_fdbt_ageing_entries(struct work_struct *work)
+{
+ struct delayed_work *dwork = to_delayed_work(work);
+ struct netc_switch *priv;
+
+ priv = container_of(dwork, struct netc_switch, fdbt_ageing_work);
+
+ /* Update the activity element in FDB table */
+ mutex_lock(&priv->fdbt_lock);
+ ntmp_fdbt_update_activity_element(&priv->ntmp);
+ /* Delete the ageing entries after the activity element is updated */
+ ntmp_fdbt_delete_ageing_entries(&priv->ntmp, NETC_FDBT_AGEING_THRESH);
+ mutex_unlock(&priv->fdbt_lock);
+
+ if (atomic_read(&priv->br_cnt))
+ schedule_delayed_work(&priv->fdbt_ageing_work,
+ READ_ONCE(priv->fdbt_ageing_delay));
+}
+
static void netc_switch_dos_default_config(struct netc_switch *priv)
{
struct netc_switch_regs *regs = &priv->regs;
@@ -872,6 +891,10 @@ static int netc_setup(struct dsa_switch *ds)
INIT_HLIST_HEAD(&priv->fdb_list);
mutex_init(&priv->fdbt_lock);
+ priv->fdbt_ageing_delay = NETC_FDBT_AGEING_DELAY;
+ atomic_set(&priv->br_cnt, 0);
+ INIT_DELAYED_WORK(&priv->fdbt_ageing_work,
+ netc_clean_fdbt_ageing_entries);
INIT_HLIST_HEAD(&priv->vlan_list);
mutex_init(&priv->vft_lock);
@@ -936,6 +959,7 @@ static void netc_teardown(struct dsa_switch *ds)
{
struct netc_switch *priv = ds->priv;
+ disable_delayed_work_sync(&priv->fdbt_ageing_work);
netc_destroy_all_lists(priv);
netc_free_host_flood_rules(priv);
netc_free_ntmp_user(priv);
@@ -1961,6 +1985,7 @@ static int netc_port_bridge_join(struct dsa_switch *ds, int port,
struct netlink_ext_ack *extack)
{
struct netc_port *np = NETC_PORT(ds, port);
+ struct netc_switch *priv = ds->priv;
u16 vlan_unaware_pvid;
int err;
@@ -1988,6 +2013,10 @@ static int netc_port_bridge_join(struct dsa_switch *ds, int port,
out:
netc_port_remove_host_flood(np, np->host_flood);
+ if (atomic_inc_return(&priv->br_cnt) == 1)
+ schedule_delayed_work(&priv->fdbt_ageing_work,
+ READ_ONCE(priv->fdbt_ageing_delay));
+
return 0;
disable_mlo:
@@ -2014,6 +2043,7 @@ static void netc_port_bridge_leave(struct dsa_switch *ds, int port,
{
struct netc_port *np = NETC_PORT(ds, port);
struct net_device *ndev = np->dp->user;
+ struct netc_switch *priv = ds->priv;
u16 vlan_unaware_pvid;
bool mc, uc;
@@ -2021,6 +2051,9 @@ static void netc_port_bridge_leave(struct dsa_switch *ds, int port,
netc_port_set_pvid(np, NETC_STANDALONE_PVID);
np->pvid = NETC_STANDALONE_PVID;
+ if (atomic_dec_and_test(&priv->br_cnt))
+ cancel_delayed_work_sync(&priv->fdbt_ageing_work);
+
netc_port_remove_dynamic_entries(np);
uc = ndev->flags & IFF_PROMISC;
mc = ndev->flags & (IFF_PROMISC | IFF_ALLMULTI);
@@ -2041,6 +2074,37 @@ static void netc_port_bridge_leave(struct dsa_switch *ds, int port,
netc_port_del_vlan_entry(np, vlan_unaware_pvid);
}
+static int netc_set_ageing_time(struct dsa_switch *ds, unsigned int msecs)
+{
+ struct netc_switch *priv = ds->priv;
+ unsigned long delay_jiffies;
+
+ /* The dynamic FDB entry is deleted when its activity counter reaches
+ * NETC_FDBT_AGEING_THRESH (100). Each delayed_work tick increments
+ * the counter by 1 if the entry is inactive.
+ *
+ * Therefore:
+ * msecs (ms) = NETC_FDBT_AGEING_THRESH * delay_ms (ms)
+ * delay_ms = msecs / NETC_FDBT_AGEING_THRESH
+ * delay_jiffies = (delay_ms / 1000) * HZ
+ * = (msecs * HZ) / (1000 * NETC_FDBT_AGEING_THRESH)
+ *
+ * Use DIV_ROUND_CLOSEST_ULL to perform a single nearest-jiffy
+ * rounding, avoiding the two-step rounding error of the intermediate
+ * delay_ms approach.
+ * Maximum error = +/-0.5 jiffy * 100 = +/-50000/HZ ms.
+ */
+ delay_jiffies = DIV_ROUND_CLOSEST_ULL((u64)msecs * HZ,
+ 1000 * NETC_FDBT_AGEING_THRESH);
+ WRITE_ONCE(priv->fdbt_ageing_delay, delay_jiffies);
+
+ if (atomic_read(&priv->br_cnt))
+ mod_delayed_work(system_percpu_wq, &priv->fdbt_ageing_work,
+ READ_ONCE(priv->fdbt_ageing_delay));
+
+ return 0;
+}
+
static void netc_port_fast_age(struct dsa_switch *ds, int port)
{
struct netc_port *np = NETC_PORT(ds, port);
@@ -2332,6 +2396,7 @@ static const struct dsa_switch_ops netc_switch_ops = {
.port_vlan_del = netc_port_vlan_del,
.port_bridge_join = netc_port_bridge_join,
.port_bridge_leave = netc_port_bridge_leave,
+ .set_ageing_time = netc_set_ageing_time,
.port_fast_age = netc_port_fast_age,
.get_pause_stats = netc_port_get_pause_stats,
.get_rmon_stats = netc_port_get_rmon_stats,
@@ -2381,6 +2446,8 @@ static int netc_switch_probe(struct pci_dev *pdev,
ds->phylink_mac_ops = &netc_phylink_mac_ops;
ds->fdb_isolation = true;
ds->max_num_bridges = priv->info->num_ports - 1;
+ ds->ageing_time_min = 1000;
+ ds->ageing_time_max = U32_MAX;
ds->priv = priv;
priv->ds = ds;
diff --git a/drivers/net/dsa/netc/netc_switch.h b/drivers/net/dsa/netc/netc_switch.h
index 982c8d3a3fbf..305f2a92e2f9 100644
--- a/drivers/net/dsa/netc/netc_switch.h
+++ b/drivers/net/dsa/netc/netc_switch.h
@@ -50,6 +50,9 @@
/* PAUSE refresh threshold: send refresh when timer reaches this value */
#define NETC_PAUSE_THRESH 0x7FFF
+#define NETC_FDBT_AGEING_DELAY (3 * HZ)
+#define NETC_FDBT_AGEING_THRESH 100
+
struct netc_switch;
struct netc_switch_info {
@@ -124,6 +127,10 @@ struct netc_switch {
struct ntmp_user ntmp;
struct hlist_head fdb_list;
struct mutex fdbt_lock; /* FDB table lock */
+ struct delayed_work fdbt_ageing_work;
+ /* (fdbt_ageing_delay * NETC_FDBT_AGEING_THRESH) is ageing time */
+ unsigned long fdbt_ageing_delay;
+ atomic_t br_cnt;
struct hlist_head vlan_list;
struct mutex vft_lock; /* VLAN filter table lock */
--
2.34.1
^ permalink raw reply related
* [RFC 00/12] PCI: Add support for Scalable I/O Virtualization
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
From: Dimitri Daskalakis <daskald@meta.com>
Scalable I/O Virtualization (SIOV) is the next-generation alternative
to SR-IOV. The goal of SIOV is to support more virtual devices than SR-IOV
can currently support, while relaxing many of the HW requirements of SR-IOV.
SIOV VFs are referred to as Scalable Device Interfaces (SDI). An SDI has
a unique PCIe Routing ID (RID), but has no configuration space, BAR,
or MSI-X table.
An overview can be found here:
https://pcisig.com/PCIExpress/ECN/Base/ScalableIOVirtualization
Since SDIs lack HW support, the complexity will fall on software
(hypervisors, VMMs, drivers, and/or firmware) to provide the same isolation
guarantees for SIOV that SR-IOV has today.
This patch series is one step in that direction, allowing the PCI subsystem
to discover the SIOV capability during enumeration. This was the minimum set
of changes needed so I could test the SIOV feature of developmental HW in
emulation. I have not tested a device that supports both SR-IOV and SIOV,
but this combination is allowed per the spec.
SIOV has two ways to assign RIDs, strided (like SR-IOV) or software assigned.
To support software RID assignment, you need to compute the RID allowlist
after all PCI devices have been enumerated. I've deferred this complexity
for now and only implemented strided RID assignment.
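As a rough illustration of strided assignment, the sketch below assumes SIOV's strided mode uses the same offset/stride arithmetic the kernel already applies to SR-IOV in pci_iov_virtfn_bus() and pci_iov_virtfn_devfn(). The struct and function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the parameters of strided RID assignment. */
struct strided_rid_params {
	uint8_t  pf_bus;	/* bus number of the PF */
	uint8_t  pf_devfn;	/* device/function number of the PF */
	uint16_t offset;	/* distance from the PF RID to the first VF/SDI */
	uint16_t stride;	/* distance between consecutive VFs/SDIs */
};

/* Compute the 16-bit Routing ID for a zero-based index. */
static uint16_t strided_rid(const struct strided_rid_params *p, unsigned int id)
{
	uint32_t pf_rid;

	assert(p != NULL);
	pf_rid = ((uint32_t)p->pf_bus << 8) | p->pf_devfn;
	return (uint16_t)(pf_rid + p->offset + (uint32_t)p->stride * id);
}

/* Split a Routing ID into its bus and devfn parts. */
static uint8_t rid_bus(uint16_t rid)
{
	return (uint8_t)(rid >> 8);
}

static uint8_t rid_devfn(uint16_t rid)
{
	return (uint8_t)(rid & 0xff);
}
```

Once the index is large enough, the computed RID carries into the next bus number, which is why a bus range has to be reserved for these devices.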
Patch 1 adds helpers to identify whether a PF/VF is an SR-IOV PF/VF. The PF and
VF bits within struct pci_dev should be agnostic of virtualization type.
The helper uses the current logic which assumes any PF/VF is SR-IOV.
Patches 2-7 use the new helpers throughout core. I didn't convert certain
device drivers (drivers/net, drivers/gpu) because the devices will not
suddenly start advertising the SIOV capability. These can be updated in
the future if desired.
Patch 8 tightens the helpers introduced in patch 1 with a new is_sriov bit.
Patch 9 is a small refactor for computing VF RID which can be shared
between SR-IOV and SIOV.
Patches 10-12 add SIOV definitions, capability detection, and bus reservation.
With this patchset, core enumerates the SIOV capability and can identify
SIOV PFs. But there is no central mechanism to allocate/manage SIOV VFs.
To support device pass through, devices will need to add a vfio-mdev
driver with IOMMUFD support (or something similar).
Dimitri Daskalakis (12):
PCI: Add helpers to identify SR-IOV PFs/VFs.
PCI: Convert iov.c to pci_is_sriov_* helpers
PCI: Convert pci.h to pci_is_sriov_* helpers
PCI: Convert arch/powerpc to pci_is_sriov_* helpers
PCI: Convert s390/pci/pci.c to pci_is_sriov_* helpers
PCI: Convert vfio_pci_core.c to pci_is_sriov_* helpers
PCI: Convert xen-pciback and pci-driver to pci_is_sriov_* helpers
PCI: Add is_sriov bit to struct pci_dev
PCI: Add helper to compute VF Routing ID to pci.h
PCI: Add Scalable I/O Virtualization data structure definitions
PCI: Initialize and release SIOV capability
PCI: Reserve bus range for SIOV devices
arch/powerpc/kernel/pci_dn.c | 4 +-
arch/powerpc/platforms/powernv/pci-ioda.c | 6 +-
arch/powerpc/platforms/powernv/pci-sriov.c | 10 +-
arch/powerpc/platforms/pseries/eeh_pseries.c | 8 +-
arch/powerpc/platforms/pseries/setup.c | 4 +-
arch/s390/pci/pci.c | 2 +-
arch/s390/pci/pci_iov.c | 1 +
drivers/pci/Kconfig | 11 ++
drivers/pci/Makefile | 1 +
drivers/pci/iov.c | 58 ++++----
drivers/pci/pci-driver.c | 4 +-
drivers/pci/pci.h | 43 +++++-
drivers/pci/probe.c | 6 +-
drivers/pci/siov.c | 134 +++++++++++++++++++
drivers/vfio/pci/vfio_pci_core.c | 12 +-
drivers/xen/xen-pciback/pci_stub.c | 2 +-
include/linux/pci.h | 29 +++-
include/uapi/linux/pci_regs.h | 12 +-
18 files changed, 289 insertions(+), 58 deletions(-)
create mode 100644 drivers/pci/siov.c
--
2.52.0
^ permalink raw reply
* [RFC 01/12] PCI: Add helpers to identify SR-IOV PFs/VFs.
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
In-Reply-To: <20260604150153.3619662-1-dimitri.daskalakis1@gmail.com>
From: Dimitri Daskalakis <daskald@meta.com>
Throughout core the pci_dev attribute is_physfn is used to determine
whether or not a PF has SR-IOV active. And is_virtfn is used to check
if a device is an SR-IOV VF.
These attributes should be generalized to represent PFs/VFs for any type
of virtualization. So in preparation, wrap the existing usage in
helpers and use helpers in subsequent patches to ease the transition.
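A minimal user-space sketch of why the wrappers help, using a mock struct rather than the real struct pci_dev. The first helper mirrors the form added in this patch. The second is only an assumption about how a later patch in the series might tighten the check with an is_sriov bit; the actual implementation may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock stand-in for the relevant bits of struct pci_dev. */
struct mock_pci_dev {
	unsigned int is_physfn:1;
	unsigned int is_virtfn:1;
	unsigned int is_sriov:1;	/* assumed bit, named in the cover letter */
};

/* Form used in this patch: a plain wrapper around the existing bit. */
static bool mock_is_sriov_physfn(const struct mock_pci_dev *dev)
{
	assert(dev != NULL);
	return dev->is_physfn;
}

/* Assumed tightened form: a PF counts as SR-IOV only if is_sriov is set. */
static bool mock_is_sriov_physfn_strict(const struct mock_pci_dev *dev)
{
	assert(dev != NULL);
	return dev->is_physfn && dev->is_sriov;
}
```

Because callers go through the helper, changing its body later would not require touching the call sites again.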
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
include/linux/pci.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 2c4454583c11..28892243f49f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -604,6 +604,16 @@ static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
return dev;
}
+static inline bool pci_is_sriov_physfn(const struct pci_dev *dev)
+{
+ return dev->is_physfn;
+}
+
+static inline bool pci_is_sriov_virtfn(const struct pci_dev *dev)
+{
+ return dev->is_virtfn;
+}
+
struct pci_dev *pci_alloc_dev(struct pci_bus *bus);
#define to_pci_dev(n) container_of(n, struct pci_dev, dev)
@@ -1277,6 +1287,7 @@ void pcibios_setup_bridge(struct pci_bus *bus, unsigned long type);
void pci_sort_breadthfirst(void);
#define dev_is_pci(d) ((d)->bus == &pci_bus_type)
#define dev_is_pf(d) ((dev_is_pci(d) ? to_pci_dev(d)->is_physfn : false))
+#define dev_is_sriov_pf(d) ((dev_is_pci(d) ? pci_is_sriov_physfn(to_pci_dev(d)) : false))
/* Generic PCI functions exported to card drivers */
@@ -2207,6 +2218,7 @@ static inline struct pci_dev *pci_dev_get(struct pci_dev *dev) { return NULL; }
#define dev_is_pci(d) (false)
#define dev_is_pf(d) (false)
+#define dev_is_sriov_pf(d) (false)
static inline bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags)
{ return false; }
static inline int pci_irqd_intx_xlate(struct irq_domain *d,
--
2.52.0
^ permalink raw reply related
* [RFC 02/12] PCI: Convert iov.c to pci_is_sriov_* helpers
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
In-Reply-To: <20260604150153.3619662-1-dimitri.daskalakis1@gmail.com>
From: Dimitri Daskalakis <daskald@meta.com>
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/iov.c | 46 +++++++++++++++++++++++-----------------------
1 file changed, 23 insertions(+), 23 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 91ac4e37ecb9..5de26057b99a 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -23,7 +23,7 @@
int pci_iov_virtfn_bus(struct pci_dev *dev, int vf_id)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -EINVAL;
return dev->bus->number + ((dev->devfn + dev->sriov->offset +
dev->sriov->stride * vf_id) >> 8);
@@ -31,7 +31,7 @@ int pci_iov_virtfn_bus(struct pci_dev *dev, int vf_id)
int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -EINVAL;
return (dev->devfn + dev->sriov->offset +
dev->sriov->stride * vf_id) & 0xff;
@@ -42,7 +42,7 @@ int pci_iov_vf_id(struct pci_dev *dev)
{
struct pci_dev *pf;
- if (!dev->is_virtfn)
+ if (!pci_is_sriov_virtfn(dev))
return -EINVAL;
pf = pci_physfn(dev);
@@ -71,7 +71,7 @@ void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
{
struct pci_dev *pf_dev;
- if (!dev->is_virtfn)
+ if (!pci_is_sriov_virtfn(dev))
return ERR_PTR(-EINVAL);
pf_dev = dev->physfn;
if (pf_dev->driver != pf_driver)
@@ -152,7 +152,7 @@ static void virtfn_remove_bus(struct pci_bus *physbus, struct pci_bus *virtbus)
resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
return dev->sriov->barsz[pci_resource_num_to_vf_bar(resno)];
@@ -300,7 +300,7 @@ static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
struct device *dev = kobj_to_dev(kobj);
struct pci_dev *pdev = to_pci_dev(dev);
- if (!pdev->is_virtfn)
+ if (!pci_is_sriov_virtfn(pdev))
return 0;
return a->mode;
@@ -604,7 +604,7 @@ static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
{
struct device *dev = kobj_to_dev(kobj);
- if (!dev_is_pf(dev))
+ if (!dev_is_sriov_pf(dev))
return 0;
return a->mode;
@@ -707,7 +707,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
if (!pdev)
return -ENODEV;
- if (!pdev->is_physfn) {
+ if (!pci_is_sriov_physfn(pdev)) {
pci_dev_put(pdev);
return -ENOSYS;
}
@@ -814,7 +814,7 @@ static int sriov_init(struct pci_dev *dev, int pos)
ctrl = 0;
list_for_each_entry(pdev, &dev->bus->devices, bus_list)
- if (pdev->is_physfn)
+ if (pci_is_sriov_physfn(pdev))
goto found;
pdev = NULL;
@@ -1006,7 +1006,7 @@ int pci_iov_init(struct pci_dev *dev)
*/
void pci_iov_release(struct pci_dev *dev)
{
- if (dev->is_physfn)
+ if (pci_is_sriov_physfn(dev))
sriov_release(dev);
}
@@ -1018,7 +1018,7 @@ void pci_iov_remove(struct pci_dev *dev)
{
struct pci_sriov *iov = dev->sriov;
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return;
iov->driver_max_VFs = iov->total_VFs;
@@ -1035,7 +1035,7 @@ void pci_iov_remove(struct pci_dev *dev)
*/
void pci_iov_update_resource(struct pci_dev *dev, int resno)
{
- struct pci_sriov *iov = dev->is_physfn ? dev->sriov : NULL;
+ struct pci_sriov *iov = pci_is_sriov_physfn(dev) ? dev->sriov : NULL;
struct resource *res = pci_resource_n(dev, resno);
int vf_bar = pci_resource_num_to_vf_bar(resno);
struct pci_bus_region region;
@@ -1111,7 +1111,7 @@ resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno)
*/
void pci_restore_iov_state(struct pci_dev *dev)
{
- if (dev->is_physfn) {
+ if (pci_is_sriov_physfn(dev)) {
sriov_restore_vf_rebar_state(dev);
sriov_restore_state(dev);
}
@@ -1124,7 +1124,7 @@ void pci_restore_iov_state(struct pci_dev *dev)
*/
void pci_vf_drivers_autoprobe(struct pci_dev *dev, bool auto_probe)
{
- if (dev->is_physfn)
+ if (pci_is_sriov_physfn(dev))
dev->sriov->drivers_autoprobe = auto_probe;
}
@@ -1141,7 +1141,7 @@ int pci_iov_bus_range(struct pci_bus *bus)
struct pci_dev *dev;
list_for_each_entry(dev, &bus->devices, bus_list) {
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
continue;
if (dev->sriov->max_VF_buses > max)
max = dev->sriov->max_VF_buses;
@@ -1161,7 +1161,7 @@ int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
{
might_sleep();
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -ENOSYS;
return sriov_enable(dev, nr_virtfn);
@@ -1176,7 +1176,7 @@ void pci_disable_sriov(struct pci_dev *dev)
{
might_sleep();
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return;
sriov_disable(dev);
@@ -1191,7 +1191,7 @@ EXPORT_SYMBOL_GPL(pci_disable_sriov);
*/
int pci_num_vf(struct pci_dev *dev)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
return dev->sriov->num_VFs;
@@ -1212,7 +1212,7 @@ int pci_vfs_assigned(struct pci_dev *dev)
unsigned short dev_id;
/* only search if we are a PF */
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
/*
@@ -1228,7 +1228,7 @@ int pci_vfs_assigned(struct pci_dev *dev)
* It is considered assigned if it is a virtual function with
* our dev as the physical function and the assigned bit is set
*/
- if (vfdev->is_virtfn && (vfdev->physfn == dev) &&
+ if (pci_is_sriov_virtfn(vfdev) && (vfdev->physfn == dev) &&
pci_is_dev_assigned(vfdev))
vfs_assigned++;
@@ -1254,7 +1254,7 @@ EXPORT_SYMBOL_GPL(pci_vfs_assigned);
*/
int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -ENOSYS;
if (numvfs > dev->sriov->total_VFs)
@@ -1279,7 +1279,7 @@ EXPORT_SYMBOL_GPL(pci_sriov_set_totalvfs);
*/
int pci_sriov_get_totalvfs(struct pci_dev *dev)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
return dev->sriov->driver_max_VFs;
@@ -1301,7 +1301,7 @@ int pci_sriov_configure_simple(struct pci_dev *dev, int nr_virtfn)
might_sleep();
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -ENODEV;
if (pci_vfs_assigned(dev)) {
--
2.52.0
^ permalink raw reply related
* [RFC 03/12] PCI: Convert pci.h to pci_is_sriov_* helpers
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
In-Reply-To: <20260604150153.3619662-1-dimitri.daskalakis1@gmail.com>
From: Dimitri Daskalakis <daskald@meta.com>
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
drivers/pci/pci.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4a14f88e543a..73b913bcb87a 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -954,7 +954,7 @@ void pci_iov_resource_set_size(struct pci_dev *dev, int resno, int size);
bool pci_iov_is_memory_decoding_enabled(struct pci_dev *dev);
static inline u16 pci_iov_vf_rebar_cap(struct pci_dev *dev)
{
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return 0;
return dev->sriov->vf_rebar_cap;
--
2.52.0
^ permalink raw reply related
* [RFC 04/12] PCI: Convert arch/powerpc to pci_is_sriov_* helpers
From: Dimitri Daskalakis @ 2026-06-04 15:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-pci, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin,
Christophe Leroy, Mahesh J Salgaonkar, Oliver O'Halloran,
Niklas Schnelle, Gerald Schaefer, Heiko Carstens, Vasily Gorbik,
Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
Alex Williamson, Jason Gunthorpe, Kevin Tian, Ankit Agrawal,
Leon Romanovsky, Juergen Gross, Stefano Stabellini,
Oleksandr Tyshchenko, Keith Busch, Alexander Duyck,
Jakub Kicinski, Dimitri Daskalakis, linuxppc-dev, linux-s390, kvm,
xen-devel
In-Reply-To: <20260604150153.3619662-1-dimitri.daskalakis1@gmail.com>
From: Dimitri Daskalakis <daskald@meta.com>
Convert SR-IOV-specific is_physfn / is_virtfn reads in the PowerPC
PCI code to use pci_is_sriov_physfn() / pci_is_sriov_virtfn(). These
call sites are all SR-IOV-specific: they guard SR-IOV state
dereferences, VF PE management, or sit inside #ifdef CONFIG_PCI_IOV
blocks. Converting them keeps SR-IOV semantics intact once is_physfn
and is_virtfn widen to cover any virtualization type.
Files touched:
arch/powerpc/kernel/pci_dn.c
arch/powerpc/platforms/powernv/pci-ioda.c
arch/powerpc/platforms/powernv/pci-sriov.c
arch/powerpc/platforms/pseries/eeh_pseries.c
arch/powerpc/platforms/pseries/setup.c
No functional changes.
Assisted-by: Claude:claude-opus-4.7
Signed-off-by: Dimitri Daskalakis <daskald@meta.com>
---
arch/powerpc/kernel/pci_dn.c | 4 ++--
arch/powerpc/platforms/powernv/pci-ioda.c | 6 +++---
arch/powerpc/platforms/powernv/pci-sriov.c | 10 +++++-----
arch/powerpc/platforms/pseries/eeh_pseries.c | 8 ++++----
arch/powerpc/platforms/pseries/setup.c | 4 ++--
5 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
index a7b664befed2..cf44ec368a36 100644
--- a/arch/powerpc/kernel/pci_dn.c
+++ b/arch/powerpc/kernel/pci_dn.c
@@ -176,7 +176,7 @@ struct pci_dn *add_sriov_vf_pdns(struct pci_dev *pdev)
int i;
/* Only support IOV for now */
- if (WARN_ON(!pdev->is_physfn))
+ if (WARN_ON(!pci_is_sriov_physfn(pdev)))
return NULL;
/* Check if VFs have been populated */
@@ -221,7 +221,7 @@ void remove_sriov_vf_pdns(struct pci_dev *pdev)
int i;
/* Only support IOV PF for now */
- if (WARN_ON(!pdev->is_physfn))
+ if (WARN_ON(!pci_is_sriov_physfn(pdev)))
return;
/* Check if VFs have been populated */
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 32ecbc46e74b..63eacc8001fe 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -987,7 +987,7 @@ static void pnv_pci_ioda_dma_dev_setup(struct pci_dev *pdev)
pe = pnv_pci_bdfn_to_pe(phb, pci_dev_id(pdev));
if (!pe) {
/* VF PEs should be pre-configured in pnv_pci_sriov_enable() */
- if (WARN_ON(pdev->is_virtfn))
+ if (WARN_ON(pci_is_sriov_virtfn(pdev)))
return;
pnv_pci_configure_bus(pdev->bus);
@@ -2379,7 +2379,7 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
struct pnv_ioda_pe *pe;
/* The VF PE state is torn down when sriov_disable() is called */
- if (pdev->is_virtfn)
+ if (pci_is_sriov_virtfn(pdev))
return;
if (!pdn || pdn->pe_number == IODA_INVALID_PE)
@@ -2391,7 +2391,7 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
* the iov state at probe time since we need to fiddle with the IOV
* resources.
*/
- if (pdev->is_physfn)
+ if (pci_is_sriov_physfn(pdev))
kfree(pdev->dev.archdata.iov_data);
#endif
diff --git a/arch/powerpc/platforms/powernv/pci-sriov.c b/arch/powerpc/platforms/powernv/pci-sriov.c
index 7105a573aec4..1113488f4372 100644
--- a/arch/powerpc/platforms/powernv/pci-sriov.c
+++ b/arch/powerpc/platforms/powernv/pci-sriov.c
@@ -225,7 +225,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
{
- if (pdev->is_virtfn) {
+ if (pci_is_sriov_virtfn(pdev)) {
struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev);
/*
@@ -235,7 +235,7 @@ void pnv_pci_ioda_fixup_iov(struct pci_dev *pdev)
*/
pe->pdev = pdev;
WARN_ON(!(pe->flags & PNV_IODA_PE_VF));
- } else if (pdev->is_physfn) {
+ } else if (pci_is_sriov_physfn(pdev)) {
/*
* For PFs adjust their allocated IOV resources to match what
* the PHB can support using its M64 BAR table.
@@ -479,7 +479,7 @@ static void pnv_ioda_release_vf_PE(struct pci_dev *pdev)
phb = pci_bus_to_pnvhb(pdev->bus);
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return;
/* FIXME: Use pnv_ioda_release_pe()? */
@@ -508,7 +508,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
u16 num_vfs;
int i;
- if (!dev->is_physfn)
+ if (!pci_is_sriov_physfn(dev))
return -EINVAL;
iov = pnv_iov_get(dev);
@@ -620,7 +620,7 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
struct pnv_iov_data *iov;
struct pci_dn *pdn;
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return;
phb = pci_bus_to_pnvhb(pdev->bus);
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index b12ef382fec7..32030ac9be51 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -54,7 +54,7 @@ static void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
dev_dbg(&pdev->dev, "EEH: Setting up device\n");
#ifdef CONFIG_PCI_IOV
- if (pdev->is_virtfn) {
+ if (pci_is_sriov_virtfn(pdev)) {
pdn->device_id = pdev->device;
pdn->vendor_id = pdev->vendor;
pdn->class_code = pdev->class;
@@ -68,7 +68,7 @@ static void pseries_pcibios_bus_add_device(struct pci_dev *pdev)
#endif
pseries_eeh_init_edev(pdn);
#ifdef CONFIG_PCI_IOV
- if (pdev->is_virtfn) {
+ if (pci_is_sriov_virtfn(pdev)) {
/*
* FIXME: This really should be handled by choosing the right
* parent PE in pseries_eeh_init_edev().
@@ -731,7 +731,7 @@ static int pseries_call_allow_unfreeze(struct eeh_dev *edev)
if (!vf_pe_array)
return -ENOMEM;
if (pci_num_vf(edev->physfn ? edev->physfn : edev->pdev)) {
- if (edev->pdev->is_physfn) {
+ if (pci_is_sriov_physfn(edev->pdev)) {
cur_vfs = pci_num_vf(edev->pdev);
pdn = eeh_dev_to_pdn(edev);
parent = pdn->parent;
@@ -779,7 +779,7 @@ static int pseries_notify_resume(struct eeh_dev *edev)
if (rtas_function_token(RTAS_FN_IBM_OPEN_SRIOV_ALLOW_UNFREEZE) == RTAS_UNKNOWN_SERVICE)
return -EINVAL;
- if (edev->pdev->is_physfn || edev->pdev->is_virtfn)
+ if (pci_is_sriov_physfn(edev->pdev) || pci_is_sriov_virtfn(edev->pdev))
return pseries_call_allow_unfreeze(edev);
return 0;
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 50b26ed8432d..8165ae9adbd6 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -778,7 +778,7 @@ static void pseries_pci_fixup_iov_resources(struct pci_dev *pdev)
const int *indexes;
struct device_node *dn = pci_device_to_OF_node(pdev);
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return;
/*Firmware must support open sriov otherwise don't configure*/
indexes = of_get_property(dn, "ibm,open-sriov-vf-bar-info", NULL);
@@ -799,7 +799,7 @@ static resource_size_t pseries_pci_iov_resource_alignment(struct pci_dev *pdev,
if (!reg)
return pci_iov_resource_size(pdev, resno);
- if (!pdev->is_physfn)
+ if (!pci_is_sriov_physfn(pdev))
return 0;
return pseries_get_iov_fw_value(pdev,
resno - PCI_IOV_RESOURCES,
--
2.52.0
^ permalink raw reply related