* [PATCH v2 01/10] cpumask: add cpumask_any_and_but()
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
To: will, mark.rutland, yury.norov, linux
Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
Thomas Gleixner, Andrew Morton, Peter Zijlstra, Rusty Russell,
Dawei Li
In-Reply-To: <20240403125109.2054881-1-dawei.li@shingroup.cn>
From: Mark Rutland <mark.rutland@arm.com>
In some cases, it's useful to be able to select a random cpu from the
intersection of two masks, excluding a particular CPU.
For example, in some systems an uncore PMU is shared by a subset of
CPUs, and management of this PMU is assigned to some arbitrary CPU in
this set. Whenever the management CPU is hotplugged out, we wish to
migrate responsibility to another arbitrary CPU which is both in this
set and online.
Today we can use cpumask_any_and() to select an arbitrary CPU in the
intersection of two masks. We can also use cpumask_any_but() to select
an arbitrary CPU in a mask, excluding a particular CPU.
To do both, we either need to use a temporary cpumask, which is
wasteful, or use some lower-level cpumask helpers, which can be unclear.
This patch adds a new cpumask_any_and_but() to cater for these cases.
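The intended semantics can be illustrated in userspace with a minimal
sketch, assuming each mask fits in a single 64-bit word. NR_CPU_IDS,
first_bit_from() and any_and_but() are illustrative stand-ins, not kernel
symbols; the real helper operates on struct cpumask.

```c
#include <stdint.h>

#define NR_CPU_IDS 64u	/* stand-in for the kernel's nr_cpu_ids */

/* Index of the lowest set bit at or above 'start', or NR_CPU_IDS if none. */
static unsigned int first_bit_from(uint64_t bits, unsigned int start)
{
	for (unsigned int i = start; i < NR_CPU_IDS; i++)
		if (bits & (UINT64_C(1) << i))
			return i;
	return NR_CPU_IDS;
}

/*
 * Model of cpumask_any_and_but(): pick a CPU set in both masks, skipping
 * 'cpu'. Returns NR_CPU_IDS when no such CPU exists.
 */
static unsigned int any_and_but(uint64_t mask1, uint64_t mask2,
				unsigned int cpu)
{
	uint64_t both = mask1 & mask2;
	unsigned int i = first_bit_from(both, 0);

	if (i != cpu)
		return i;

	/* The first candidate was the excluded CPU: continue after it. */
	return first_bit_from(both, cpu + 1);
}
```

With masks 0xE and 0x7 the intersection is CPUs 1 and 2, so excluding CPU 1
selects CPU 2; if the excluded CPU is the only one in the intersection, the
result is NR_CPU_IDS.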
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
include/linux/cpumask.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 1c29947db848..121f3ac757ff 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -388,6 +388,29 @@ unsigned int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
return i;
}
+/**
+ * cpumask_any_and_but - pick a "random" cpu from *mask1 & *mask2, but not this one.
+ * @mask1: the first input cpumask
+ * @mask2: the second input cpumask
+ * @cpu: the cpu to ignore
+ *
+ * Returns >= nr_cpu_ids if no cpus set.
+ */
+static inline
+unsigned int cpumask_any_and_but(const struct cpumask *mask1,
+ const struct cpumask *mask2,
+ unsigned int cpu)
+{
+ unsigned int i;
+
+ cpumask_check(cpu);
+ i = cpumask_first_and(mask1, mask2);
+ if (i != cpu)
+ return i;
+
+ return cpumask_next_and(cpu, mask1, mask2);
+}
+
/**
* cpumask_nth - get the Nth cpu in a cpumask
* @srcp: the cpumask pointer
--
2.27.0
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH v2 05/10] perf/arm_dsu: Avoid placing cpumask var on stack
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
To: will, mark.rutland, yury.norov, linux
Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
Dawei Li
In-Reply-To: <20240403125109.2054881-1-dawei.li@shingroup.cn>
On a CONFIG_CPUMASK_OFFSTACK=y kernel, explicitly allocating a cpumask
variable on the stack is not recommended, since it can cause a stack
overflow.
Instead, kernel code should always use the *cpumask_var API(s) to allocate
a cpumask var in a config-neutral way, leaving the allocation strategy to
CONFIG_CPUMASK_OFFSTACK.
But dynamic allocation in a cpuhp teardown callback is problematic if the
allocation fails (which is unlikely but still possible):
- If -ENOMEM is returned to the caller, the kernel crashes for a non-bringup
teardown;
- If the callback pretends nothing happened and returns 0 to the caller, it
may leave the system in an inconsistent/compromised state.
Use the newly introduced cpumask_any_and_but() to address all of the issues
above. It eliminates the temporary cpumask var in a generic way, no matter
how the cpumask var is allocated.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
drivers/perf/arm_dsu_pmu.c | 19 ++++++-------------
1 file changed, 6 insertions(+), 13 deletions(-)
diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
index bae3ca37f846..adc0bbb5fafe 100644
--- a/drivers/perf/arm_dsu_pmu.c
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -230,15 +230,6 @@ static const struct attribute_group *dsu_pmu_attr_groups[] = {
NULL,
};
-static int dsu_pmu_get_online_cpu_any_but(struct dsu_pmu *dsu_pmu, int cpu)
-{
- struct cpumask online_supported;
-
- cpumask_and(&online_supported,
- &dsu_pmu->associated_cpus, cpu_online_mask);
- return cpumask_any_but(&online_supported, cpu);
-}
-
static inline bool dsu_pmu_counter_valid(struct dsu_pmu *dsu_pmu, u32 idx)
{
return (idx < dsu_pmu->num_counters) ||
@@ -827,14 +818,16 @@ static int dsu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
static int dsu_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
{
- int dst;
- struct dsu_pmu *dsu_pmu = hlist_entry_safe(node, struct dsu_pmu,
- cpuhp_node);
+ struct dsu_pmu *dsu_pmu;
+ unsigned int dst;
+
+ dsu_pmu = hlist_entry_safe(node, struct dsu_pmu, cpuhp_node);
if (!cpumask_test_and_clear_cpu(cpu, &dsu_pmu->active_cpu))
return 0;
- dst = dsu_pmu_get_online_cpu_any_but(dsu_pmu, cpu);
+ dst = cpumask_any_and_but(&dsu_pmu->associated_cpus,
+ cpu_online_mask, cpu);
/* If there are no active CPUs in the DSU, leave IRQ disabled */
if (dst >= nr_cpu_ids)
return 0;
--
2.27.0
* [PATCH v2 04/10] perf/arm_cspmu: Avoid placing cpumask var on stack
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
To: will, mark.rutland, yury.norov, linux
Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
Dawei Li
In-Reply-To: <20240403125109.2054881-1-dawei.li@shingroup.cn>
On a CONFIG_CPUMASK_OFFSTACK=y kernel, explicitly allocating a cpumask
variable on the stack is not recommended, since it can cause a stack
overflow.
Instead, kernel code should always use the *cpumask_var API(s) to allocate
a cpumask var in a config-neutral way, leaving the allocation strategy to
CONFIG_CPUMASK_OFFSTACK.
But dynamic allocation in a cpuhp teardown callback is problematic if the
allocation fails (which is unlikely but still possible):
- If -ENOMEM is returned to the caller, the kernel crashes for a non-bringup
teardown;
- If the callback pretends nothing happened and returns 0 to the caller, it
may leave the system in an inconsistent/compromised state.
Use the newly introduced cpumask_any_and_but() to address all of the issues
above. It eliminates the temporary cpumask var in a generic way, no matter
how the cpumask var is allocated.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
drivers/perf/arm_cspmu/arm_cspmu.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index b9a252272f1e..fd1004251665 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -1322,8 +1322,7 @@ static int arm_cspmu_cpu_online(unsigned int cpu, struct hlist_node *node)
static int arm_cspmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
{
- int dst;
- struct cpumask online_supported;
+ unsigned int dst;
struct arm_cspmu *cspmu =
hlist_entry_safe(node, struct arm_cspmu, cpuhp_node);
@@ -1333,9 +1332,8 @@ static int arm_cspmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
return 0;
/* Choose a new CPU to migrate ownership of the PMU to */
- cpumask_and(&online_supported, &cspmu->associated_cpus,
- cpu_online_mask);
- dst = cpumask_any_but(&online_supported, cpu);
+ dst = cpumask_any_and_but(&cspmu->associated_cpus,
+ cpu_online_mask, cpu);
if (dst >= nr_cpu_ids)
return 0;
--
2.27.0
* [PATCH v2 03/10] perf/arm-cmn: Avoid placing cpumask var on stack
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
To: will, mark.rutland, yury.norov, linux
Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
Dawei Li
In-Reply-To: <20240403125109.2054881-1-dawei.li@shingroup.cn>
On a CONFIG_CPUMASK_OFFSTACK=y kernel, explicitly allocating a cpumask
variable on the stack is not recommended, since it can cause a stack
overflow.
Instead, kernel code should always use the *cpumask_var API(s) to allocate
a cpumask var in a config-neutral way, leaving the allocation strategy to
CONFIG_CPUMASK_OFFSTACK.
But dynamic allocation in a cpuhp teardown callback is problematic if the
allocation fails (which is unlikely but still possible):
- If -ENOMEM is returned to the caller, the kernel crashes for a non-bringup
teardown;
- If the callback pretends nothing happened and returns 0 to the caller, it
may leave the system in an inconsistent/compromised state.
Use the newly introduced cpumask_any_and_but() to address all of the issues
above. It eliminates the temporary cpumask var in a generic way, no matter
how the cpumask var is allocated.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
drivers/perf/arm-cmn.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index 7ef9c7e4836b..6bfb0c4a1287 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -1950,20 +1950,20 @@ static int arm_cmn_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_no
struct arm_cmn *cmn;
unsigned int target;
int node;
- cpumask_t mask;
cmn = hlist_entry_safe(cpuhp_node, struct arm_cmn, cpuhp_node);
if (cpu != cmn->cpu)
return 0;
node = dev_to_node(cmn->dev);
- if (cpumask_and(&mask, cpumask_of_node(node), cpu_online_mask) &&
- cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
- target = cpumask_any(&mask);
- else
+
+ target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
+ if (target >= nr_cpu_ids)
target = cpumask_any_but(cpu_online_mask, cpu);
+
if (target < nr_cpu_ids)
arm_cmn_migrate(cmn, target);
+
return 0;
}
--
2.27.0
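The node-local-then-global fallback in the hunk above can be modelled in
userspace as follows. This is a sketch under stated assumptions: masks are
single 64-bit words, and NR_CPU_IDS, any_but() and pick_target() are
hypothetical names used only for illustration. Unlike the kernel helper, the
model simply clears the excluded bit in a temporary word, which costs nothing
in userspace; only the selection order is meant to match.

```c
#include <stdint.h>

#define NR_CPU_IDS 64u	/* stand-in for the kernel's nr_cpu_ids */

/* Lowest set bit of 'bits' other than 'cpu', or NR_CPU_IDS if none. */
static unsigned int any_but(uint64_t bits, unsigned int cpu)
{
	if (cpu < NR_CPU_IDS)
		bits &= ~(UINT64_C(1) << cpu);

	for (unsigned int i = 0; i < NR_CPU_IDS; i++)
		if (bits & (UINT64_C(1) << i))
			return i;
	return NR_CPU_IDS;
}

/*
 * Choose a migration target for a PMU whose owning CPU is going offline:
 * prefer an online CPU on the same node, else any other online CPU.
 */
static unsigned int pick_target(uint64_t node_cpus, uint64_t online,
				unsigned int cpu)
{
	unsigned int target = any_but(node_cpus & online, cpu);

	if (target >= NR_CPU_IDS)
		target = any_but(online, cpu);
	return target;
}
```

If the node holds only the outgoing CPU, the first lookup fails and the
target comes from the remaining online CPUs; if nothing else is online, the
result stays at NR_CPU_IDS and no migration happens.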
* [PATCH v2 02/10] perf/alibaba_uncore_drw: Avoid placing cpumask var on stack
From: Dawei Li @ 2024-04-03 12:51 UTC (permalink / raw)
To: will, mark.rutland, yury.norov, linux
Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
Dawei Li
In-Reply-To: <20240403125109.2054881-1-dawei.li@shingroup.cn>
On a CONFIG_CPUMASK_OFFSTACK=y kernel, explicitly allocating a cpumask
variable on the stack is not recommended, since it can cause a stack
overflow.
Instead, kernel code should always use the *cpumask_var API(s) to allocate
a cpumask var in a config-neutral way, leaving the allocation strategy to
CONFIG_CPUMASK_OFFSTACK.
But dynamic allocation in a cpuhp teardown callback is problematic if the
allocation fails (which is unlikely but still possible):
- If -ENOMEM is returned to the caller, the kernel crashes for a non-bringup
teardown;
- If the callback pretends nothing happened and returns 0 to the caller, it
may leave the system in an inconsistent/compromised state.
Use the newly introduced cpumask_any_and_but() to address all of the issues
above. It eliminates the temporary cpumask var in a generic way, no matter
how the cpumask var is allocated.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
drivers/perf/alibaba_uncore_drw_pmu.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_uncore_drw_pmu.c
index a9277dcf90ce..d4d14b65c4a5 100644
--- a/drivers/perf/alibaba_uncore_drw_pmu.c
+++ b/drivers/perf/alibaba_uncore_drw_pmu.c
@@ -746,18 +746,14 @@ static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
struct ali_drw_pmu_irq *irq;
struct ali_drw_pmu *drw_pmu;
unsigned int target;
- int ret;
- cpumask_t node_online_cpus;
irq = hlist_entry_safe(node, struct ali_drw_pmu_irq, node);
if (cpu != irq->cpu)
return 0;
- ret = cpumask_and(&node_online_cpus,
- cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask);
- if (ret)
- target = cpumask_any_but(&node_online_cpus, cpu);
- else
+ target = cpumask_any_and_but(cpumask_of_node(cpu_to_node(cpu)),
+ cpu_online_mask, cpu);
+ if (target >= nr_cpu_ids)
target = cpumask_any_but(cpu_online_mask, cpu);
if (target >= nr_cpu_ids)
--
2.27.0
* [PATCH v2 00/10] perf: Avoid placing cpumask var on stack
From: Dawei Li @ 2024-04-03 12:50 UTC (permalink / raw)
To: will, mark.rutland, yury.norov, linux
Cc: xueshuai, renyu.zj, yangyicong, jonathan.cameron, andersson,
konrad.dybcio, linux-arm-kernel, linux-kernel, linux-arm-msm,
Dawei Li
Hi all,
This is v2 of [1] and [2], which eliminates on-stack cpumask var allocation
in the perf subsystem.
Changes since v1:
- Change from dynamic allocation to a helper that needs no temporary var:
  cpumask_any_and_but(). [Mark]
- Some minor coding style improvements, e.g. reverse Christmas tree ordering.
- For cpumask_any_and_but() itself:
  - Moved to cpumask.h, just like other helpers.
  - Return value converted to unsigned int.
  - Remove EXPORT_SYMBOL, since the helper is now a static inline in a header.
[1]:
https://lore.kernel.org/lkml/20240402105610.1695644-1-dawei.li@shingroup.cn/
[2]:
https://lore.kernel.org/lkml/1486381132-5610-1-git-send-email-mark.rutland@arm.com/
Dawei Li (9):
perf/alibaba_uncore_drw: Avoid placing cpumask var on stack
perf/arm-cmn: Avoid placing cpumask var on stack
perf/arm_cspmu: Avoid placing cpumask var on stack
perf/arm_dsu: Avoid placing cpumask var on stack
perf/dwc_pcie: Avoid placing cpumask var on stack
perf/hisi_pcie: Avoid placing cpumask var on stack
perf/hisi_uncore: Avoid placing cpumask var on stack
perf/qcom_l2: Avoid placing cpumask var on stack
perf/thunderx2: Avoid placing cpumask var on stack
Mark Rutland (1):
cpumask: add cpumask_any_and_but()
drivers/perf/alibaba_uncore_drw_pmu.c | 10 +++-------
drivers/perf/arm-cmn.c | 10 +++++-----
drivers/perf/arm_cspmu/arm_cspmu.c | 8 +++-----
drivers/perf/arm_dsu_pmu.c | 19 ++++++-------------
drivers/perf/dwc_pcie_pmu.c | 10 ++++------
drivers/perf/hisilicon/hisi_pcie_pmu.c | 9 ++++-----
drivers/perf/hisilicon/hisi_uncore_pmu.c | 6 ++----
drivers/perf/qcom_l2_pmu.c | 8 +++-----
drivers/perf/thunderx2_pmu.c | 10 +++-------
include/linux/cpumask.h | 23 +++++++++++++++++++++++
10 files changed, 56 insertions(+), 57 deletions(-)
Thanks,
Dawei
--
2.27.0
* Re: [PATCH v4 00/15] Unified cross-architecture kernel-mode FPU API
From: Christian König @ 2024-04-03 12:51 UTC (permalink / raw)
To: Samuel Holland, Andrew Morton, linux-arm-kernel, x86
Cc: linux-kernel, linux-arch, linuxppc-dev, linux-riscv,
Christoph Hellwig, loongarch, amd-gfx, Borislav Petkov,
Catalin Marinas, Dave Hansen, Huacai Chen, Ingo Molnar,
Jonathan Corbet, Masahiro Yamada, Nathan Chancellor,
Nicolas Schier, Russell King, Thomas Gleixner, Will Deacon,
linux-doc, linux-kbuild, Wentland, Harry, Rodrigo Siqueira
In-Reply-To: <20240329072441.591471-1-samuel.holland@sifive.com>
I only skimmed over the platform patches and spent only a few minutes on
the amdgpu stuff.
From what I've seen this series seems to make perfect sense to me, I
just can't fully judge everything.
So feel free to add Acked-by: Christian König <christian.koenig@amd.com>
but I strongly suggest that Harry and Rodrigo take a look as well.
Regards,
Christian.
Am 29.03.24 um 08:18 schrieb Samuel Holland:
> This series unifies the kernel-mode FPU API across several architectures
> by wrapping the existing functions (where needed) in consistently-named
> functions placed in a consistent header location, with mostly the same
> semantics: they can be called from preemptible or non-preemptible task
> context, and are not assumed to be reentrant. Architectures are also
> expected to provide CFLAGS adjustments for compiling FPU-dependent code.
> For the moment, SIMD/vector units are out of scope for this common API.
>
> This allows us to remove the ifdeffery and duplicated Makefile logic at
> each FPU user. It then implements the common API on RISC-V, and converts
> a couple of users to the new API: the AMDGPU DRM driver, and the FPU
> self test.
>
> The underlying goal of this series is to allow using newer AMD GPUs
> (e.g. Navi) on RISC-V boards such as SiFive's HiFive Unmatched. Those
> GPUs need CONFIG_DRM_AMD_DC_FP to initialize, which requires kernel-mode
> FPU support.
>
> Previous versions:
> v3: https://lore.kernel.org/linux-kernel/20240327200157.1097089-1-samuel.holland@sifive.com/
> v2: https://lore.kernel.org/linux-kernel/20231228014220.3562640-1-samuel.holland@sifive.com/
> v1: https://lore.kernel.org/linux-kernel/20231208055501.2916202-1-samuel.holland@sifive.com/
> v0: https://lore.kernel.org/linux-kernel/20231122030621.3759313-1-samuel.holland@sifive.com/
>
> Changes in v4:
> - Add missed CFLAGS changes for recov_neon_inner.c
> (fixes arm build failures)
> - Fix x86 include guard issue (fixes x86 build failures)
>
> Changes in v3:
> - Rebase on v6.9-rc1
> - Limit riscv ARCH_HAS_KERNEL_FPU_SUPPORT to 64BIT
>
> Changes in v2:
> - Add documentation explaining the build-time and runtime APIs
> - Add a linux/fpu.h header for generic isolation enforcement
> - Remove file name from header comment
> - Clean up arch/arm64/lib/Makefile, like for arch/arm
> - Remove RISC-V architecture-specific preprocessor check
> - Split altivec removal to a separate patch
> - Use linux/fpu.h instead of asm/fpu.h in consumers
> - Declare test_fpu() in a header
>
> Michael Ellerman (1):
> drm/amd/display: Only use hard-float, not altivec on powerpc
>
> Samuel Holland (14):
> arch: Add ARCH_HAS_KERNEL_FPU_SUPPORT
> ARM: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
> ARM: crypto: Use CC_FLAGS_FPU for NEON CFLAGS
> arm64: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
> arm64: crypto: Use CC_FLAGS_FPU for NEON CFLAGS
> lib/raid6: Use CC_FLAGS_FPU for NEON CFLAGS
> LoongArch: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
> powerpc: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
> x86/fpu: Fix asm/fpu/types.h include guard
> x86: Implement ARCH_HAS_KERNEL_FPU_SUPPORT
> riscv: Add support for kernel-mode FPU
> drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT
> selftests/fpu: Move FP code to a separate translation unit
> selftests/fpu: Allow building on other architectures
>
> Documentation/core-api/floating-point.rst | 78 +++++++++++++++++++
> Documentation/core-api/index.rst | 1 +
> Makefile | 5 ++
> arch/Kconfig | 6 ++
> arch/arm/Kconfig | 1 +
> arch/arm/Makefile | 7 ++
> arch/arm/include/asm/fpu.h | 15 ++++
> arch/arm/lib/Makefile | 3 +-
> arch/arm64/Kconfig | 1 +
> arch/arm64/Makefile | 9 ++-
> arch/arm64/include/asm/fpu.h | 15 ++++
> arch/arm64/lib/Makefile | 6 +-
> arch/loongarch/Kconfig | 1 +
> arch/loongarch/Makefile | 5 +-
> arch/loongarch/include/asm/fpu.h | 1 +
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/Makefile | 5 +-
> arch/powerpc/include/asm/fpu.h | 28 +++++++
> arch/riscv/Kconfig | 1 +
> arch/riscv/Makefile | 3 +
> arch/riscv/include/asm/fpu.h | 16 ++++
> arch/riscv/kernel/Makefile | 1 +
> arch/riscv/kernel/kernel_mode_fpu.c | 28 +++++++
> arch/x86/Kconfig | 1 +
> arch/x86/Makefile | 20 +++++
> arch/x86/include/asm/fpu.h | 13 ++++
> arch/x86/include/asm/fpu/types.h | 6 +-
> drivers/gpu/drm/amd/display/Kconfig | 2 +-
> .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c | 35 +--------
> drivers/gpu/drm/amd/display/dc/dml/Makefile | 36 +--------
> drivers/gpu/drm/amd/display/dc/dml2/Makefile | 36 +--------
> include/linux/fpu.h | 12 +++
> lib/Kconfig.debug | 2 +-
> lib/Makefile | 26 +------
> lib/raid6/Makefile | 33 +++-----
> lib/test_fpu.h | 8 ++
> lib/{test_fpu.c => test_fpu_glue.c} | 37 ++-------
> lib/test_fpu_impl.c | 37 +++++++++
> 38 files changed, 348 insertions(+), 193 deletions(-)
> create mode 100644 Documentation/core-api/floating-point.rst
> create mode 100644 arch/arm/include/asm/fpu.h
> create mode 100644 arch/arm64/include/asm/fpu.h
> create mode 100644 arch/powerpc/include/asm/fpu.h
> create mode 100644 arch/riscv/include/asm/fpu.h
> create mode 100644 arch/riscv/kernel/kernel_mode_fpu.c
> create mode 100644 arch/x86/include/asm/fpu.h
> create mode 100644 include/linux/fpu.h
> create mode 100644 lib/test_fpu.h
> rename lib/{test_fpu.c => test_fpu_glue.c} (71%)
> create mode 100644 lib/test_fpu_impl.c
>
* Re: [PATCH] media: mediatek: vcodec: fix the error sizeimage for 10bit bitstream
From: Sebastian Fricke @ 2024-04-03 12:39 UTC (permalink / raw)
To: Yunfei Dong
Cc: Nícolas F . R . A . Prado, Nicolas Dufresne, Hans Verkuil,
AngeloGioacchino Del Regno, Benjamin Gaignard, Nathan Hebert,
Hsin-Yi Wang, Fritz Koenig, Daniel Vetter, Steve Cho, linux-media,
devicetree, linux-kernel, linux-arm-kernel, linux-mediatek,
Project_Global_Chrome_Upstream_Group
In-Reply-To: <20240403093018.13168-1-yunfei.dong@mediatek.com>
Hey Yunfei,
On 03.04.2024 17:30, Yunfei Dong wrote:
>The sizeimage of each plane are calculated the same way for 8bit and
s/The sizeimage of each plane are/The sizeimage for each plane is/
>10bit bitstream. Need to enlarge the sizeimage with simeimage*5/4 for
>10bit bitstream when try and set fmt.
s/bitstream/bitstreams/
s/Need to enlarge the sizeimage with simeimage*5/4 for 10bit bitstream when try and set fmt./
Scale up the sizeimage by 25% for 10-bit bitstreams in try_fmt./
>
>Fixes: 9d86be9bda6c ("media: mediatek: vcodec: Add driver to support 10bit")
>Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
>---
> .../mediatek/vcodec/decoder/mtk_vcodec_dec.c | 47 ++++++++++++++-----
> 1 file changed, 34 insertions(+), 13 deletions(-)
>
>diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
>index 9107707de6c4..45209894f1fe 100644
>--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
>+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec.c
>@@ -259,6 +259,7 @@ static int vidioc_try_fmt(struct mtk_vcodec_dec_ctx *ctx, struct v4l2_format *f,
> pix_fmt_mp->num_planes = 1;
> pix_fmt_mp->plane_fmt[0].bytesperline = 0;
> } else if (f->type == V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE) {
>+ unsigned int dram_y, dram_c, dram_y_10bit, dram_c_10bit;
> int tmp_w, tmp_h;
>
> /*
>@@ -280,22 +281,42 @@ static int vidioc_try_fmt(struct mtk_vcodec_dec_ctx *ctx, struct v4l2_format *f,
> (pix_fmt_mp->height + 64) <= frmsize->max_height)
> pix_fmt_mp->height += 64;
>
>- mtk_v4l2_vdec_dbg(0, ctx,
>- "before resize wxh=%dx%d, after resize wxh=%dx%d, sizeimage=%d",
>- tmp_w, tmp_h, pix_fmt_mp->width, pix_fmt_mp->height,
>- pix_fmt_mp->width * pix_fmt_mp->height);
>+ dram_y = pix_fmt_mp->width * pix_fmt_mp->height;
>+ dram_c = dram_y / 2;
>+
>+ dram_y_10bit = dram_y * 5 / 4;
>+ dram_c_10bit = dram_y_10bit / 2;
I'd skip the two 10 bit variables (dram_y_10bit & dram_c_10bit) and
instead do it like this:
```
dram_stride = pix_fmt_mp->width;
if (ctx->is_10bit_bitstream)
dram_stride = dram_stride * 5 / 4;
dram_y = dram_stride * pix_fmt_mp->height;
dram_c = dram_y / 2;
if (pix_fmt_mp->num_planes == 1) {
pix_fmt_mp->plane_fmt[0].bytesperline = dram_stride;
pix_fmt_mp->plane_fmt[0].sizeimage = dram_y + dram_c;
} else {
pix_fmt_mp->plane_fmt[0].bytesperline = dram_stride;
pix_fmt_mp->plane_fmt[1].bytesperline = dram_stride;
pix_fmt_mp->plane_fmt[0].sizeimage = dram_y;
pix_fmt_mp->plane_fmt[1].sizeimage = dram_c;
...
}
```
Also, why do you call all the variables dram?
Please note that this isn't tested, but it shows the general direction for
repeating a lot less code.
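The same arithmetic can be checked outside the driver with a standalone
sketch. struct plane_layout and compute_layout() are hypothetical names made
up for this illustration; only the stride scaling and the luma/chroma split
are taken from the suggestion above.

```c
#include <stdbool.h>

struct plane_layout {
	unsigned int bytesperline;	/* stride shared by all planes */
	unsigned int sizeimage[2];	/* [1] stays 0 for single-plane formats */
};

static struct plane_layout compute_layout(unsigned int width,
					  unsigned int height,
					  unsigned int num_planes,
					  bool is_10bit)
{
	struct plane_layout l = { 0 };
	unsigned int stride = width;
	unsigned int y, c;

	if (is_10bit)
		stride = stride * 5 / 4;	/* 25% wider rows for 10-bit */

	y = stride * height;	/* luma size */
	c = y / 2;		/* chroma is half the luma size, as in the patch */

	l.bytesperline = stride;
	if (num_planes == 1) {
		l.sizeimage[0] = y + c;
	} else {
		l.sizeimage[0] = y;
		l.sizeimage[1] = c;
	}
	return l;
}
```

For a 1920x1088 10-bit two-plane buffer this gives a stride of 2400 bytes,
2611200 bytes of luma and 1305600 bytes of chroma; the 8-bit single-plane
case comes to 3133440 bytes in total.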
Greetings,
Sebastian
>
> pix_fmt_mp->num_planes = fmt->num_planes;
>- pix_fmt_mp->plane_fmt[0].sizeimage =
>- pix_fmt_mp->width * pix_fmt_mp->height;
>- pix_fmt_mp->plane_fmt[0].bytesperline = pix_fmt_mp->width;
>-
>- if (pix_fmt_mp->num_planes == 2) {
>- pix_fmt_mp->plane_fmt[1].sizeimage =
>- (pix_fmt_mp->width * pix_fmt_mp->height) / 2;
>- pix_fmt_mp->plane_fmt[1].bytesperline =
>- pix_fmt_mp->width;
>+ if (pix_fmt_mp->num_planes == 1) {
>+ if (ctx->is_10bit_bitstream) {
>+ pix_fmt_mp->plane_fmt[0].bytesperline = pix_fmt_mp->width * 5 / 4;
>+ pix_fmt_mp->plane_fmt[0].sizeimage = dram_y_10bit + dram_c_10bit;
>+ } else {
>+ pix_fmt_mp->plane_fmt[0].bytesperline = pix_fmt_mp->width;
>+ pix_fmt_mp->plane_fmt[0].sizeimage = dram_y + dram_c;
>+ }
>+ } else {
>+ if (ctx->is_10bit_bitstream) {
>+ pix_fmt_mp->plane_fmt[0].bytesperline = pix_fmt_mp->width * 5 / 4;
>+ pix_fmt_mp->plane_fmt[1].bytesperline = pix_fmt_mp->width * 5 / 4;
>+
>+ pix_fmt_mp->plane_fmt[0].sizeimage = dram_y_10bit;
>+ pix_fmt_mp->plane_fmt[1].sizeimage = dram_c_10bit;
>+ } else {
>+ pix_fmt_mp->plane_fmt[0].bytesperline = pix_fmt_mp->width;
>+ pix_fmt_mp->plane_fmt[1].bytesperline = pix_fmt_mp->width;
>+
>+ pix_fmt_mp->plane_fmt[0].sizeimage = dram_y;
>+ pix_fmt_mp->plane_fmt[1].sizeimage = dram_c;
>+ }
> }
>+
>+ mtk_v4l2_vdec_dbg(0, ctx,
>+ "before resize:%dx%d, after resize:%dx%d, sizeimage=0x%x_0x%x",
>+ tmp_w, tmp_h, pix_fmt_mp->width, pix_fmt_mp->height,
>+ pix_fmt_mp->plane_fmt[0].sizeimage,
>+ pix_fmt_mp->plane_fmt[1].sizeimage);
> }
>
> pix_fmt_mp->flags = 0;
>--
>2.25.1
>
* Re: [PATCH v4 0/4] Update Energy Model after chip binning adjusted voltages
From: Lukasz Luba @ 2024-04-03 12:36 UTC (permalink / raw)
To: Dietmar Eggemann
Cc: linux-arm-kernel, sboyd, rafael, linux-pm, nm, linux-samsung-soc,
daniel.lezcano, viresh.kumar, krzysztof.kozlowski, alim.akhtar,
m.szyprowski, mhiramat, linux-kernel
In-Reply-To: <045fa6db-4f76-46aa-85ba-c9e698c7e390@arm.com>
Hi Dietmar,
On 4/3/24 13:07, Dietmar Eggemann wrote:
> On 02/04/2024 17:58, Lukasz Luba wrote:
>> Hi all,
>>
>> This is a follow-up patch aiming to add EM modification due to chip binning.
>> The first RFC and the discussion can be found here [1].
>>
>> It uses Exynos chip driver code as a 1st user. The EM framework has been
>> extended to handle this use case easily, when the voltage has been changed
>> after setup. On my Odroid-xu4 in some OPPs I can observe ~20% power difference.
>> According to that data in driver tables it could be up to ~29%.
>>
>> This chip binning is applicable to a lot of SoCs, so the EM framework should
>> make it easy to update. It uses the existing OPP and DT information to
>> re-calculate the new power values.
>>
>> It has dependency on Exynos SoC driver tree.
>>
>> Changes:
>> v4:
>> - added asterisk in the comment section (test robot)
>> - change the patch 2/4 header name and use 'Refactor'
>> v3:
>> - updated header description patch 2/4 (Dietmar)
>> - removed 2 sentences from comment and adjusted in patch 3/4 (Dietmar)
>> - patch 4/4 re-phrased code comment (Dietmar)
>> - collected tags (Krzysztof, Viresh)
>> v2:
>> - removed 'ret' from error message which wasn't initialized (Christian)
>> v1:
>> - exported the OPP calculation function from the OPP/OF so it can be
>> used from EM fwk (Viresh)
>> - refactored EM updating function to re-use common code
>> - added new EM function which can be used by chip device drivers which
>> modify the voltage in OPPs
>> RFC is at [1]
>>
>> Regards,
>> Lukasz Luba
>>
>> [1] https://lore.kernel.org/lkml/20231220110339.1065505-1-lukasz.luba@arm.com/
>>
>> Lukasz Luba (4):
>> OPP: OF: Export dev_opp_pm_calc_power() for usage from EM
>> PM: EM: Refactor em_adjust_new_capacity()
>> PM: EM: Add em_dev_update_chip_binning()
>> soc: samsung: exynos-asv: Update Energy Model after adjusting voltage
>>
>> drivers/opp/of.c | 17 +++--
>> drivers/soc/samsung/exynos-asv.c | 11 +++-
>> include/linux/energy_model.h | 5 ++
>> include/linux/pm_opp.h | 8 +++
>> kernel/power/energy_model.c | 106 +++++++++++++++++++++++++------
>> 5 files changed, 122 insertions(+), 25 deletions(-)
>
> LGTM.
>
> Just two very minor things which I mentioned in the individual patches.
>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
>
>
>
Thank you for the review. I will send v5 with those addressed.
Regards,
Lukasz
* Re: [PATCH v2 02/18] PCI: endpoint: Introduce pci_epc_map_align()
From: Kishon Vijay Abraham I @ 2024-04-03 12:33 UTC (permalink / raw)
To: Damien Le Moal, Manivannan Sadhasivam, Lorenzo Pieralisi,
Kishon Vijay Abraham I, Shawn Lin, Krzysztof Wilczyński,
Bjorn Helgaas, Heiko Stuebner, linux-pci, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, devicetree
Cc: linux-rockchip, linux-arm-kernel, Rick Wertenbroek,
Wilfred Mallawa, Niklas Cassel
In-Reply-To: <20240330041928.1555578-3-dlemoal@kernel.org>
Hi Damien,
On 3/30/2024 9:49 AM, Damien Le Moal wrote:
> Some endpoint controllers have requirements on the alignment of the
> controller physical memory address that must be used to map a RC PCI
> address region. For instance, the rockchip endpoint controller uses
> at most the lower 20 bits of a physical memory address region as the
> lower bits of an RC PCI address. For mapping a PCI address region of
> size bytes starting from pci_addr, the exact number of address bits
> used is the number of address bits changing in the address range
> [pci_addr..pci_addr + size - 1].
>
> For this example, this creates the following constraints:
> 1) The offset into the controller physical memory allocated for a
> mapping depends on the mapping size *and* the starting PCI address
> for the mapping.
> 2) A mapping size cannot exceed the controller windows size (1MB) minus
> the offset needed into the allocated physical memory, which can end
> up being a smaller size than the desired mapping size.
>
> Handling these constraints independently of the controller being used in
> a PCI EP function driver is not possible with the current EPC API as
> it only provides the ->align field in struct pci_epc_features.
> Furthermore, this alignment is static and does not depend on a mapping
> pci address and size.
>
> Solve this by introducing the function pci_epc_map_align() and the
> endpoint controller operation ->map_align to allow endpoint function
> drivers to obtain the size and the offset into a controller address
> region that must be used to map an RC PCI address region. The size
> of the physical address region provided by pci_epc_map_align() can then
> be used as the size argument for the function pci_epc_mem_alloc_addr().
> The offset into the allocated controller memory can be used to
> correctly handle data transfers. Of note is that pci_epc_map_align() may
> indicate upon return a mapping size that is smaller (but not 0) than the
> requested PCI address region size. For such case, an endpoint function
> driver must handle data transfers in fragments.
>
> The controller operation ->map_align is optional: controllers that do
> not have any address alignment constraints for mapping a RC PCI address
> region do not need to implement this operation. For such controllers,
> pci_epc_map_align() always returns the mapping size as equal
> to the requested size and an offset equal to 0.
>
> The structure pci_epc_map is introduced to represent a mapping start PCI
> address, size and the size and offset into the controller memory needed
> for mapping the PCI address region.
>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
> drivers/pci/endpoint/pci-epc-core.c | 66 +++++++++++++++++++++++++++++
> include/linux/pci-epc.h | 33 +++++++++++++++
> 2 files changed, 99 insertions(+)
>
> diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
> index 754afd115bbd..37758ca91d7f 100644
> --- a/drivers/pci/endpoint/pci-epc-core.c
> +++ b/drivers/pci/endpoint/pci-epc-core.c
> @@ -433,6 +433,72 @@ void pci_epc_unmap_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> }
> EXPORT_SYMBOL_GPL(pci_epc_unmap_addr);
>
> +/**
> + * pci_epc_map_align() - Get the offset into and the size of a controller memory
> + * address region needed to map a RC PCI address region
> + * @epc: the EPC device on which address is allocated
> + * @func_no: the physical endpoint function number in the EPC device
> + * @vfunc_no: the virtual endpoint function number in the physical function
> + * @pci_addr: PCI address to which the physical address should be mapped
> + * @size: the size of the mapping starting from @pci_addr
> + * @map: populate here the actual size and offset into the controller memory
> + * that must be allocated for the mapping
> + *
> + * Invoke the controller map_align operation to obtain the size and the offset
> + * into a controller address region that must be allocated to map @size
> + * bytes of the RC PCI address space starting from @pci_addr.
> + *
> + * The size of the mapping that can be handled by the controller is indicated
> + * using the pci_size field of @map. This size may be smaller than the requested
> + * @size. In such case, the function driver must handle the mapping using
> + * several fragments. The offset into the controller memory for the effective
> + * mapping of the @pci_addr..@pci_addr+@map->pci_size address range is indicated
> + * using the map_ofst field of @map.
> + */
> +int pci_epc_map_align(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + u64 pci_addr, size_t size, struct pci_epc_map *map)
> +{
> + const struct pci_epc_features *features;
> + size_t mask;
> + int ret;
> +
> + if (!pci_epc_function_is_valid(epc, func_no, vfunc_no))
> + return -EINVAL;
> +
> + if (!size || !map)
> + return -EINVAL;
> +
> + memset(map, 0, sizeof(*map));
> + map->pci_addr = pci_addr;
> + map->pci_size = size;
> +
> + if (epc->ops->map_align) {
> + mutex_lock(&epc->lock);
> + ret = epc->ops->map_align(epc, func_no, vfunc_no, map);
> + mutex_unlock(&epc->lock);
> + return ret;
> + }
> +
> + /*
> + * Assume a fixed alignment constraint as specified by the controller
> + * features.
> + */
> + features = pci_epc_get_features(epc, func_no, vfunc_no);
> + if (!features || !features->align) {
> + map->map_pci_addr = pci_addr;
> + map->map_size = size;
> + map->map_ofst = 0;
> + }
The 'align' field of pci_epc_features was initially added only to address
the inbound ATU constraints. This is also noted in a comment in [1]. The PCI
address restrictions (only a fixed alignment constraint) were handled by
the host side driver and depend on the connected endpoint device
(at least it was like that for pci_endpoint_test.c [2]).
So pci-epf-test.c used the 'align' in pci_epc_features only as part of
pci_epf_alloc_space().
Though I have abused 'align' of pci_epc_features in pci-epf-ntb.c by using
it outside of pci_epf_alloc_space(), I think we should keep the use of
'align' of pci_epc_features only within pci_epf_alloc_space() and have
controllers with any PCI address restrictions implement ->map_align().
This could as well be done in a phased manner: let controllers implement
->map_align() first and then remove the use of pci_epc_features in
pci_epc_map_align(). Let me know what you think.
Thanks,
Kishon
[1] ->
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/pci-epc.h?h=v6.9-rc2#n187
[2] ->
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/misc/pci_endpoint_test.c?h=v6.9-rc2#n127
> +
> + mask = features->align - 1;
> + map->map_pci_addr = map->pci_addr & ~mask;
> + map->map_ofst = map->pci_addr & mask;
> + map->map_size = ALIGN(map->map_ofst + map->pci_size, features->align);
> +
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(pci_epc_map_align);
> +
> /**
> * pci_epc_map_addr() - map CPU address to PCI address
> * @epc: the EPC device on which address is allocated
> diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
> index cc2f70d061c8..8cfb4aaf2628 100644
> --- a/include/linux/pci-epc.h
> +++ b/include/linux/pci-epc.h
> @@ -32,11 +32,40 @@ pci_epc_interface_string(enum pci_epc_interface_type type)
> }
> }
>
> +/**
> + * struct pci_epc_map - information about EPC memory for mapping a RC PCI
> + * address range
> + * @pci_addr: start address of the RC PCI address range to map
> + * @pci_size: size of the RC PCI address range to map
> + * @map_pci_addr: RC PCI address used as the first address mapped
> + * @map_size: size of the controller memory needed for the mapping
> + * @map_ofst: offset into the controller memory needed for the mapping
> + * @phys_base: base physical address of the allocated EPC memory
> + * @phys_addr: physical address at which @pci_addr is mapped
> + * @virt_base: base virtual address of the allocated EPC memory
> + * @virt_addr: virtual address at which @pci_addr is mapped
> + */
> +struct pci_epc_map {
> + phys_addr_t pci_addr;
> + size_t pci_size;
> +
> + phys_addr_t map_pci_addr;
> + size_t map_size;
> + phys_addr_t map_ofst;
> +
> + phys_addr_t phys_base;
> + phys_addr_t phys_addr;
> + void __iomem *virt_base;
> + void __iomem *virt_addr;
> +};
> +
> /**
> * struct pci_epc_ops - set of function pointers for performing EPC operations
> * @write_header: ops to populate configuration space header
> * @set_bar: ops to configure the BAR
> * @clear_bar: ops to reset the BAR
> + * @map_align: operation to get the size and offset into a controller memory
> + * window needed to map an RC PCI address region
> * @map_addr: ops to map CPU address to PCI address
> * @unmap_addr: ops to unmap CPU address and PCI address
> * @set_msi: ops to set the requested number of MSI interrupts in the MSI
> @@ -61,6 +90,8 @@ struct pci_epc_ops {
> struct pci_epf_bar *epf_bar);
> void (*clear_bar)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> struct pci_epf_bar *epf_bar);
> + int (*map_align)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + struct pci_epc_map *map);
> int (*map_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> phys_addr_t addr, u64 pci_addr, size_t size);
> void (*unmap_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> @@ -234,6 +265,8 @@ int pci_epc_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> struct pci_epf_bar *epf_bar);
> void pci_epc_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> struct pci_epf_bar *epf_bar);
> +int pci_epc_map_align(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> + u64 pci_addr, size_t size, struct pci_epc_map *map);
> int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
> phys_addr_t phys_addr,
> u64 pci_addr, size_t size);
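For reference, the fixed-alignment fallback discussed above can be tried
out in user space. The following is only a minimal sketch, not the kernel
code: struct map_sketch, align_up() and map_align_fixed() are hypothetical
stand-ins for struct pci_epc_map, ALIGN() and the tail of
pci_epc_map_align(). The sketch returns early when no alignment is given;
the quoted hunk as shown has no such return after the !features->align
branch, so it would go on to evaluate features->align - 1 in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the fields set by the fallback. */
struct map_sketch {
	uint64_t pci_addr;     /* requested RC PCI address */
	size_t   pci_size;     /* requested mapping size */
	uint64_t map_pci_addr; /* aligned PCI address actually mapped */
	size_t   map_size;     /* controller memory needed for the mapping */
	uint64_t map_ofst;     /* offset of pci_addr inside that memory */
};

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Mirror of the fixed-alignment computation; align == 0 means "no constraint". */
static void map_align_fixed(struct map_sketch *map, uint64_t pci_addr,
			    size_t size, uint64_t align)
{
	uint64_t mask;

	assert((align & (align - 1)) == 0); /* zero or a power of two */

	map->pci_addr = pci_addr;
	map->pci_size = size;

	if (!align) {
		/* No constraint: map exactly what was requested. */
		map->map_pci_addr = pci_addr;
		map->map_size = size;
		map->map_ofst = 0;
		return;
	}

	mask = align - 1;
	map->map_pci_addr = pci_addr & ~mask;
	map->map_ofst = pci_addr & mask;
	map->map_size = (size_t)align_up(map->map_ofst + size, align);
}
```

With a 64KB alignment, a request for 0x1000 bytes at PCI address
0x12345678 maps from 0x12340000 with an offset of 0x5678 and needs
0x10000 bytes of controller memory.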
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH v4 05/13] mm/arch: Provide pud_pfn() fallback
From: Christophe Leroy @ 2024-04-03 12:26 UTC (permalink / raw)
To: Jason Gunthorpe, Peter Xu
Cc: Nathan Chancellor, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, Yang Shi, Kirill A . Shutemov,
Mike Kravetz, John Hubbard, Michael Ellerman, Andrew Jones,
Muchun Song, linux-riscv@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, Andrew Morton, Christoph Hellwig,
Lorenzo Stoakes, Matthew Wilcox, Rik van Riel,
linux-arm-kernel@lists.infradead.org, Andrea Arcangeli,
David Hildenbrand, Aneesh Kumar K . V, Vlastimil Babka,
James Houghton, Mike Rapoport, Axel Rasmussen, Huacai Chen,
WANG Xuerui, loongarch@lists.linux.dev
In-Reply-To: <20240403120841.GB1723999@nvidia.com>
Le 03/04/2024 à 14:08, Jason Gunthorpe a écrit :
> On Tue, Apr 02, 2024 at 07:35:45PM -0400, Peter Xu wrote:
>> On Tue, Apr 02, 2024 at 07:53:20PM -0300, Jason Gunthorpe wrote:
>>> On Tue, Apr 02, 2024 at 06:43:56PM -0400, Peter Xu wrote:
>>>
>>>> I actually tested this without hitting the issue (even though I didn't
>>>> mention it in the cover letter..). I re-kicked the build test, it turns
>>>> out my "make alldefconfig" on loongarch will generate a config with both
>>>> HUGETLB=n && THP=n, while arch/loongarch/configs/loongson3_defconfig has
>>>> THP=y (which I assume was the one above build used). I didn't further
>>>> check how "make alldefconfig" generated the config; a bit surprising that
>>>> it didn't fetch from there.
>>>
>>> I suspect it is weird compiler variations.. Maybe something is not
>>> being inlined.
>>>
>>>> (and it also surprises me that this BUILD_BUG can trigger.. I used to try
>>>> triggering it elsewhere but failed..)
>>>
>>> As the pud_leaf() == FALSE should result in the BUILD_BUG never being
>>> called and the optimizer removing it.
>>
>> Good point, for some reason loongarch defined pud_leaf() without defining
>> pud_pfn(), which does look strange.
>>
>> #define pud_leaf(pud) ((pud_val(pud) & _PAGE_HUGE) != 0)
>>
>> But I noticed at least MIPS also does it.. Logically I think one arch
>> should define either none or both.
>
> Wow, this is definitely an arch issue. You can't define pud_leaf() and
> not have a pud_pfn(). It makes no sense at all..
>
> I'd say the BUILD_BUG has done its job and found an issue, fix it by
> not defining pud_leaf? I don't see any calls to pud_leaf in loongarch
> at least
As far as I can see it was added by commit 303be4b33562 ("LoongArch: mm:
Add p?d_leaf() definitions").
Not sure it was added for a good reason, and I'm not sure what was added
is correct because arch/loongarch/include/asm/pgtable-bits.h has:
#define _PAGE_HUGE_SHIFT 6 /* HUGE is a PMD bit */
So I'm not sure it is correct to use that bit for PUD, is it?
Probably pud_leaf() should always return false.
Christophe
* Re: [PATCH v4 0/5] TEE driver for Trusted Services
From: Jens Wiklander @ 2024-04-03 12:13 UTC (permalink / raw)
To: Balint Dobszay
Cc: op-tee, linux-doc, linux-kernel, linux-arm-kernel, sumit.garg,
corbet, sudeep.holla, rdunlap, krzk, gyorgy.szing
In-Reply-To: <20240325151105.135667-1-balint.dobszay@arm.com>
On Mon, Mar 25, 2024 at 4:11 PM Balint Dobszay <balint.dobszay@arm.com> wrote:
>
> This series introduces a TEE driver for Trusted Services [1].
>
> Trusted Services is a TrustedFirmware.org project that provides a
> framework for developing and deploying device Root of Trust services in
> FF-A [2] Secure Partitions. The project hosts the reference
> implementation of Arm Platform Security Architecture [3] for Arm
> A-profile devices.
>
> The FF-A Secure Partitions are accessible through the FF-A driver in
> Linux. However, the FF-A driver doesn't have a user space interface so
> user space clients currently cannot access Trusted Services. The goal of
> this TEE driver is to bridge this gap and make Trusted Services
> functionality accessible from user space.
>
> Changelog:
> v3[7] -> v4:
> - Remove unnecessary callbacks from tstee_ops
> - Add maintainers entry for the new driver
>
> v2[6] -> v3:
> - Add patch "tee: Refactor TEE subsystem header files" from Sumit
> - Remove unnecessary includes from core.c
> - Remove the mutex from "struct ts_context_data" since the same
> mechanism could be implemented by reusing the XArray's internal lock
> - Rename tee_shm_pool_op_*_helper functions as suggested by Sumit
> - Replace pr_* with dev_* as previously suggested by Krzysztof
>
> v1[5] -> v2:
> - Refactor session handling to use XArray instead of IDR and linked
> list (the linked list was redundant as pointed out by Jens, and IDR
> is now deprecated in favor of XArray)
> - Refactor tstee_probe() to not call tee_device_unregister() before
> calling tee_device_register()
> - Address comments from Krzysztof and Jens
> - Address documentation comments from Randy
> - Use module_ffa_driver() macro instead of separate module init / exit
> functions
> - Reformat max line length 100 -> 80
>
> RFC[4] -> v1:
> - Add patch for moving pool_op helper functions to the TEE subsystem,
> as suggested by Jens
> - Address comments from Sumit, add patch for documentation
>
> [1] https://www.trustedfirmware.org/projects/trusted-services/
> [2] https://developer.arm.com/documentation/den0077/
> [3] https://www.arm.com/architecture/security-features/platform-security
> [4] https://lore.kernel.org/linux-arm-kernel/20230927152145.111777-1-balint.dobszay@arm.com/
> [5] https://lore.kernel.org/lkml/20240213145239.379875-1-balint.dobszay@arm.com/
> [6] https://lore.kernel.org/lkml/20240223095133.109046-1-balint.dobszay@arm.com/
> [7] https://lore.kernel.org/lkml/20240305101745.213933-1-balint.dobszay@arm.com/
>
>
> Balint Dobszay (4):
> tee: optee: Move pool_op helper functions
> tee: tstee: Add Trusted Services TEE driver
> Documentation: tee: Add TS-TEE driver
> MAINTAINERS: tee: tstee: Add entry
>
> Sumit Garg (1):
> tee: Refactor TEE subsystem header files
>
> Documentation/tee/index.rst | 1 +
> Documentation/tee/ts-tee.rst | 71 ++++
> MAINTAINERS | 10 +
> drivers/tee/Kconfig | 1 +
> drivers/tee/Makefile | 1 +
> drivers/tee/amdtee/amdtee_private.h | 2 +-
> drivers/tee/amdtee/call.c | 2 +-
> drivers/tee/amdtee/core.c | 3 +-
> drivers/tee/amdtee/shm_pool.c | 2 +-
> drivers/tee/optee/call.c | 2 +-
> drivers/tee/optee/core.c | 66 +---
> drivers/tee/optee/device.c | 2 +-
> drivers/tee/optee/ffa_abi.c | 8 +-
> drivers/tee/optee/notif.c | 2 +-
> drivers/tee/optee/optee_private.h | 14 +-
> drivers/tee/optee/rpc.c | 2 +-
> drivers/tee/optee/smc_abi.c | 11 +-
> drivers/tee/tee_core.c | 2 +-
> drivers/tee/tee_private.h | 35 --
> drivers/tee/tee_shm.c | 66 +++-
> drivers/tee/tee_shm_pool.c | 2 +-
> drivers/tee/tstee/Kconfig | 11 +
> drivers/tee/tstee/Makefile | 3 +
> drivers/tee/tstee/core.c | 480 ++++++++++++++++++++++++++++
> drivers/tee/tstee/tstee_private.h | 92 ++++++
> include/linux/tee_core.h | 306 ++++++++++++++++++
> include/linux/tee_drv.h | 285 ++---------------
> include/uapi/linux/tee.h | 1 +
> 28 files changed, 1094 insertions(+), 389 deletions(-)
> create mode 100644 Documentation/tee/ts-tee.rst
> create mode 100644 drivers/tee/tstee/Kconfig
> create mode 100644 drivers/tee/tstee/Makefile
> create mode 100644 drivers/tee/tstee/core.c
> create mode 100644 drivers/tee/tstee/tstee_private.h
> create mode 100644 include/linux/tee_core.h
>
> --
> 2.34.1
>
I'm picking up this patch set.
Thanks,
Jens
* Re: [PATCH v4 05/13] mm/arch: Provide pud_pfn() fallback
From: Jason Gunthorpe @ 2024-04-03 12:08 UTC (permalink / raw)
To: Peter Xu
Cc: Nathan Chancellor, linux-mm, linux-kernel, Yang Shi,
Kirill A . Shutemov, Mike Kravetz, John Hubbard, Michael Ellerman,
Andrew Jones, Muchun Song, linux-riscv, linuxppc-dev,
Christophe Leroy, Andrew Morton, Christoph Hellwig,
Lorenzo Stoakes, Matthew Wilcox, Rik van Riel, linux-arm-kernel,
Andrea Arcangeli, David Hildenbrand, Aneesh Kumar K . V,
Vlastimil Babka, James Houghton, Mike Rapoport, Axel Rasmussen,
Huacai Chen, WANG Xuerui, loongarch
In-Reply-To: <ZgyWUYVdUsAiXCC4@xz-m1.local>
On Tue, Apr 02, 2024 at 07:35:45PM -0400, Peter Xu wrote:
> On Tue, Apr 02, 2024 at 07:53:20PM -0300, Jason Gunthorpe wrote:
> > On Tue, Apr 02, 2024 at 06:43:56PM -0400, Peter Xu wrote:
> >
> > > I actually tested this without hitting the issue (even though I didn't
> > > mention it in the cover letter..). I re-kicked the build test, it turns
> > > out my "make alldefconfig" on loongarch will generate a config with both
> > > HUGETLB=n && THP=n, while arch/loongarch/configs/loongson3_defconfig has
> > > THP=y (which I assume was the one above build used). I didn't further
> > > check how "make alldefconfig" generated the config; a bit surprising that
> > > it didn't fetch from there.
> >
> > I suspect it is weird compiler variations.. Maybe something is not
> > being inlined.
> >
> > > (and it also surprises me that this BUILD_BUG can trigger.. I used to try
> > > triggering it elsewhere but failed..)
> >
> > As the pud_leaf() == FALSE should result in the BUILD_BUG never being
> > called and the optimizer removing it.
>
> Good point, for some reason loongarch defined pud_leaf() without defining
> pud_pfn(), which does look strange.
>
> #define pud_leaf(pud) ((pud_val(pud) & _PAGE_HUGE) != 0)
>
> But I noticed at least MIPS also does it.. Logically I think one arch
> should define either none or both.
Wow, this is definitely an arch issue. You can't define pud_leaf() and
not have a pud_pfn(). It makes no sense at all..
I'd say the BUILD_BUG has done its job and found an issue, fix it by
not defining pud_leaf? I don't see any calls to pud_leaf in loongarch
at least
Jason
* Re: [PATCH v4 4/4] soc: samsung: exynos-asv: Update Energy Model after adjusting voltage
From: Dietmar Eggemann @ 2024-04-03 12:07 UTC (permalink / raw)
To: Lukasz Luba, linux-kernel, linux-pm, rafael
Cc: linux-arm-kernel, sboyd, nm, linux-samsung-soc, daniel.lezcano,
viresh.kumar, krzysztof.kozlowski, alim.akhtar, m.szyprowski,
mhiramat
In-Reply-To: <20240402155822.505491-5-lukasz.luba@arm.com>
On 02/04/2024 17:58, Lukasz Luba wrote:
[...]
> @@ -97,9 +98,17 @@ static int exynos_asv_update_opps(struct exynos_asv *asv)
> last_opp_table = opp_table;
>
> ret = exynos_asv_update_cpu_opps(asv, cpu);
> - if (ret < 0)
> + if (!ret) {
> + /*
> + * When the voltage for OPPs could be changed,
> + * make sure to update the EM power values, to
> + * reflect the reality and not use stale data.
> + */
Maybe shorter?
/*
* Update EM power values since OPP
* voltage values may have changed.
*/
[...]
* Re: [PATCH v4 1/4] OPP: OF: Export dev_opp_pm_calc_power() for usage from EM
From: Dietmar Eggemann @ 2024-04-03 12:07 UTC (permalink / raw)
To: Lukasz Luba, linux-kernel, linux-pm, rafael
Cc: linux-arm-kernel, sboyd, nm, linux-samsung-soc, daniel.lezcano,
viresh.kumar, krzysztof.kozlowski, alim.akhtar, m.szyprowski,
mhiramat
In-Reply-To: <20240402155822.505491-2-lukasz.luba@arm.com>
On 02/04/2024 17:58, Lukasz Luba wrote:
[...]
> @@ -539,6 +541,12 @@ static inline void dev_pm_opp_of_unregister_em(struct device *dev)
> {
> }
>
> +static inline int dev_pm_opp_calc_power(struct device *dev, unsigned long *uW,
> + unsigned long *kHz)
This looks like a weird alignment compared to the adjacent functions.
[...]
* Re: [PATCH v4 0/4] Update Energy Model after chip binning adjusted voltages
From: Dietmar Eggemann @ 2024-04-03 12:07 UTC (permalink / raw)
To: Lukasz Luba, linux-kernel, linux-pm, rafael
Cc: linux-arm-kernel, sboyd, nm, linux-samsung-soc, daniel.lezcano,
viresh.kumar, krzysztof.kozlowski, alim.akhtar, m.szyprowski,
mhiramat
In-Reply-To: <20240402155822.505491-1-lukasz.luba@arm.com>
On 02/04/2024 17:58, Lukasz Luba wrote:
> Hi all,
>
> This is a follow-up patch aiming to add EM modification due to chip binning.
> The first RFC and the discussion can be found here [1].
>
> It uses Exynos chip driver code as a 1st user. The EM framework has been
> extended to handle this use case easily, when the voltage has been changed
> after setup. On my Odroid-xu4 in some OPPs I can observe ~20% power difference.
> According to that data in driver tables it could be up to ~29%.
>
> This chip binning is applicable to a lot of SoCs, so the EM framework should
> make it easy to update. It uses the existing OPP and DT information to
> re-calculate the new power values.
>
> It has dependency on Exynos SoC driver tree.
>
> Changes:
> v4:
> - added asterisk in the comment section (test robot)
> - change the patch 2/4 header name and use 'Refactor'
> v3:
> - updated header description patch 2/4 (Dietmar)
> - removed 2 sentences from comment and adjusted in patch 3/4 (Dietmar)
> - patch 4/4 re-phrased code comment (Dietmar)
> - collected tags (Krzysztof, Viresh)
> v2:
> - removed 'ret' from error message which wasn't initialized (Christian)
> v1:
> - exported the OPP calculation function from the OPP/OF so it can be
> used from EM fwk (Viresh)
> - refactored EM updating function to re-use common code
> - added new EM function which can be used by chip device drivers which
> modify the voltage in OPPs
> RFC is at [1]
>
> Regards,
> Lukasz Luba
>
> [1] https://lore.kernel.org/lkml/20231220110339.1065505-1-lukasz.luba@arm.com/
>
> Lukasz Luba (4):
> OPP: OF: Export dev_opp_pm_calc_power() for usage from EM
> PM: EM: Refactor em_adjust_new_capacity()
> PM: EM: Add em_dev_update_chip_binning()
> soc: samsung: exynos-asv: Update Energy Model after adjusting voltage
>
> drivers/opp/of.c | 17 +++--
> drivers/soc/samsung/exynos-asv.c | 11 +++-
> include/linux/energy_model.h | 5 ++
> include/linux/pm_opp.h | 8 +++
> kernel/power/energy_model.c | 106 +++++++++++++++++++++++++------
> 5 files changed, 122 insertions(+), 25 deletions(-)
LGTM.
Just two very minor things which I mentioned in the individual patches.
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
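As a rough illustration of why a binning-adjusted voltage matters for the
EM power values: assuming the estimate follows the conventional dynamic
power model P = C * f * V^2 (C in uW/MHz/V^2, f in MHz, V in volts), power
scales with the square of the voltage. The helper below is a hypothetical
sketch with made-up numbers, not the kernel's implementation.

```c
#include <stdint.h>

/*
 * Estimated dynamic power in micro-Watts under the assumed model
 * P = C * f * V^2, with the coefficient in uW/MHz/V^2, the frequency
 * in MHz and the voltage in mV (hence the division by 10^6).
 */
static uint64_t est_power_uw(uint64_t coeff, uint64_t freq_mhz, uint64_t mv)
{
	return coeff * mv * mv * freq_mhz / 1000000;
}
```

With a coefficient of 100 at 1000 MHz, est_power_uw() gives 100000 uW at
1000 mV and 81000 uW at 900 mV, so a 10% lower voltage yields a 19% lower
estimate. Leaving the EM untouched after such an adjustment would keep the
stale, higher figure.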
* [PATCH net-next v6 10/10] net: ti: icssg-prueth: Add ICSSG Ethernet driver for AM65x SR1.0 platforms
From: Diogo Ivo @ 2024-04-03 10:48 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, danishanwar, rogerq, vigneshr,
arnd, wsa+renesas, vladimir.oltean, andrew, dan.carpenter, netdev,
linux-arm-kernel
Cc: Diogo Ivo, jan.kiszka
In-Reply-To: <20240403104821.283832-1-diogo.ivo@siemens.com>
Add the PRUeth driver for the ICSSG subsystem found in AM65x SR1.0 devices.
The main differences that set SR1.0 and SR2.0 apart are the missing TXPRU
core in SR1.0, two extra DMA channels for management purposes and different
firmware that needs to be configured accordingly.
Based on the work of Roger Quadros, Vignesh Raghavendra and
Grygorii Strashko in TI's 5.10 SDK [1].
[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y
Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
---
Changes in v5:
- Add missing le32_to_cpu() calls when handling timestamps in
prueth_tx_ts_sr1()
- Declare only one TX channel
- Remove 100Mbit/s half-duplex mode
- Added Reviewed-by tag from Danish
Changes in v4:
- Convert all fields in icssg_config_sr1() to __le32 (reported by sparse)
- Declare data in emac_send_command_sr1() as __le32 (reported by sparse)
- Fix reverse xmastree variable declaration
Changes in v3:
- Remove full duplex check in icssg_config_set_speed_sr1(),
allowing the firmware to be informed of half duplex operation.
This eliminates the need to unconditionally remove half duplex
modes from being advertised.
- Remove call to icssg_config_half_duplex() in emac_adjust_link_sr1()
as for SR1.0 icssg_config_sr1() already provides a rand_seed.
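For reviewers, the command word that icssg_config_set_speed_sr1() sends to
the firmware can be modelled in isolation. This is a sketch under stated
assumptions: SPEED_DUPLEX_CMD_BASE is a hypothetical placeholder for
ICSSG_PSTATE_SPEED_DUPLEX_CMD_SR1 (its real value is not shown in this
patch), the speed code is assumed to fit in two bits, and the masking is
added here for safety and is not part of the driver code.

```c
#include <stdint.h>

/* Hypothetical base opcode, standing in for ICSSG_PSTATE_SPEED_DUPLEX_CMD_SR1. */
#define SPEED_DUPLEX_CMD_BASE 0x100u

/* Pack the speed code into bits 2-1 and the full-duplex flag into bit 3. */
static uint32_t build_speed_duplex_cmd(uint32_t speed_code, uint32_t full_duplex)
{
	uint32_t cmd = SPEED_DUPLEX_CMD_BASE;

	cmd |= (speed_code & 0x3u) << 1;  /* firmware expects speed in bits 2-1 */
	cmd |= (full_duplex & 0x1u) << 3; /* firmware expects duplex in bit 3 */

	return cmd;
}
```

Since the full duplex check was dropped in v3, a half duplex link simply
leaves bit 3 clear, so the firmware is informed of half duplex operation
through the same command.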
drivers/net/ethernet/ti/Kconfig | 15 +
drivers/net/ethernet/ti/Makefile | 8 +
.../net/ethernet/ti/icssg/icssg_prueth_sr1.c | 1181 +++++++++++++++++
3 files changed, 1204 insertions(+)
create mode 100644 drivers/net/ethernet/ti/icssg/icssg_prueth_sr1.c
diff --git a/drivers/net/ethernet/ti/Kconfig b/drivers/net/ethernet/ti/Kconfig
index 1530d13984d4..deed1fc33e40 100644
--- a/drivers/net/ethernet/ti/Kconfig
+++ b/drivers/net/ethernet/ti/Kconfig
@@ -198,6 +198,21 @@ config TI_ICSSG_PRUETH
to support the Ethernet operation. Currently, it supports Ethernet
with 1G and 100M link speed.
+config TI_ICSSG_PRUETH_SR1
+ tristate "TI Gigabit PRU SR1.0 Ethernet driver"
+ select PHYLIB
+ select TI_ICSS_IEP
+ select TI_K3_CPPI_DESC_POOL
+ depends on PRU_REMOTEPROC
+ depends on ARCH_K3 && OF && TI_K3_UDMA_GLUE_LAYER
+ help
+ Support dual Gigabit Ethernet ports over the ICSSG PRU Subsystem.
+ This subsystem is available on the AM65 SR1.0 platform.
+
+ This driver requires firmware binaries which will run on the PRUs
+ to support the Ethernet operation. Currently, it supports Ethernet
+ with 1G, 100M and 10M link speed.
+
config TI_ICSS_IEP
tristate "TI PRU ICSS IEP driver"
depends on PTP_1588_CLOCK_OPTIONAL
diff --git a/drivers/net/ethernet/ti/Makefile b/drivers/net/ethernet/ti/Makefile
index 4876f20aa495..6e086b4c0384 100644
--- a/drivers/net/ethernet/ti/Makefile
+++ b/drivers/net/ethernet/ti/Makefile
@@ -40,4 +40,12 @@ icssg-prueth-y := icssg/icssg_prueth.o \
icssg/icssg_mii_cfg.o \
icssg/icssg_stats.o \
icssg/icssg_ethtool.o
+obj-$(CONFIG_TI_ICSSG_PRUETH_SR1) += icssg-prueth-sr1.o
+icssg-prueth-sr1-y := icssg/icssg_prueth_sr1.o \
+ icssg/icssg_common.o \
+ icssg/icssg_classifier.o \
+ icssg/icssg_config.o \
+ icssg/icssg_mii_cfg.o \
+ icssg/icssg_stats.o \
+ icssg/icssg_ethtool.o
obj-$(CONFIG_TI_ICSS_IEP) += icssg/icss_iep.o
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth_sr1.c b/drivers/net/ethernet/ti/icssg/icssg_prueth_sr1.c
new file mode 100644
index 000000000000..7b3304bbd7fc
--- /dev/null
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth_sr1.c
@@ -0,0 +1,1181 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Texas Instruments ICSSG SR1.0 Ethernet Driver
+ *
+ * Copyright (C) 2018-2022 Texas Instruments Incorporated - https://www.ti.com/
+ * Copyright (c) Siemens AG, 2024
+ *
+ */
+
+#include <linux/etherdevice.h>
+#include <linux/genalloc.h>
+#include <linux/kernel.h>
+#include <linux/mfd/syscon.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_mdio.h>
+#include <linux/of_net.h>
+#include <linux/platform_device.h>
+#include <linux/property.h>
+#include <linux/phy.h>
+#include <linux/remoteproc/pruss.h>
+#include <linux/pruss_driver.h>
+
+#include "icssg_prueth.h"
+#include "icssg_mii_rt.h"
+#include "../k3-cppi-desc-pool.h"
+
+#define PRUETH_MODULE_DESCRIPTION "PRUSS ICSSG SR1.0 Ethernet driver"
+
+/* SR1: Set buffer sizes for the pools. There are 8 internal queues
+ * implemented in firmware, but only 4 tx channels/threads in the Egress
+ * direction to firmware. Need a high priority queue for management
+ * messages since they shouldn't be blocked even during high traffic
+ * situation. So use Q0-Q2 as data queues and Q3 as management queue
+ * in the max case. However for ease of configuration, use the max
+ * data queue + 1 for management message if we are not using max
+ * case.
+ *
+ * Allocate 4 MTU buffers per data queue. Firmware requires
+ * pool sizes to be set for internal queues. Set the upper 5 queue
+ * pool size to min size of 128 bytes since there are only 3 tx
+ * data channels and management queue requires only minimum buffer.
+ * i.e lower queues are used by driver and highest priority queue
+ * from that is used for management message.
+ */
+
+static int emac_egress_buf_pool_size[] = {
+ PRUETH_EMAC_BUF_POOL_SIZE_SR1, PRUETH_EMAC_BUF_POOL_SIZE_SR1,
+ PRUETH_EMAC_BUF_POOL_SIZE_SR1, PRUETH_EMAC_BUF_POOL_MIN_SIZE_SR1,
+ PRUETH_EMAC_BUF_POOL_MIN_SIZE_SR1, PRUETH_EMAC_BUF_POOL_MIN_SIZE_SR1,
+ PRUETH_EMAC_BUF_POOL_MIN_SIZE_SR1, PRUETH_EMAC_BUF_POOL_MIN_SIZE_SR1
+};
+
+static void icssg_config_sr1(struct prueth *prueth, struct prueth_emac *emac,
+ int slice)
+{
+ struct icssg_sr1_config config;
+ void __iomem *va;
+ int i, index;
+
+ memset(&config, 0, sizeof(config));
+ config.addr_lo = cpu_to_le32(lower_32_bits(prueth->msmcram.pa));
+ config.addr_hi = cpu_to_le32(upper_32_bits(prueth->msmcram.pa));
+ config.rx_flow_id = cpu_to_le32(emac->rx_flow_id_base); /* flow id for host port */
+ config.rx_mgr_flow_id = cpu_to_le32(emac->rx_mgm_flow_id_base); /* for mgm ch */
+ config.rand_seed = cpu_to_le32(get_random_u32());
+
+ for (i = PRUETH_EMAC_BUF_POOL_START_SR1; i < PRUETH_NUM_BUF_POOLS_SR1; i++) {
+ index = i - PRUETH_EMAC_BUF_POOL_START_SR1;
+ config.tx_buf_sz[i] = cpu_to_le32(emac_egress_buf_pool_size[index]);
+ }
+
+ va = prueth->shram.va + slice * ICSSG_CONFIG_OFFSET_SLICE1;
+ memcpy_toio(va, &config, sizeof(config));
+
+ emac->speed = SPEED_1000;
+ emac->duplex = DUPLEX_FULL;
+}
+
+static int emac_send_command_sr1(struct prueth_emac *emac, u32 cmd)
+{
+ struct cppi5_host_desc_t *first_desc;
+ u32 pkt_len = sizeof(emac->cmd_data);
+ __le32 *data = emac->cmd_data;
+ dma_addr_t desc_dma, buf_dma;
+ struct prueth_tx_chn *tx_chn;
+ void **swdata;
+ int ret = 0;
+ u32 *epib;
+
+ netdev_dbg(emac->ndev, "Sending cmd %x\n", cmd);
+
+ /* only one command at a time allowed to firmware */
+ mutex_lock(&emac->cmd_lock);
+ data[0] = cpu_to_le32(cmd);
+
+ /* highest priority channel for management messages */
+ tx_chn = &emac->tx_chns[emac->tx_ch_num - 1];
+
+ /* Map the linear buffer */
+ buf_dma = dma_map_single(tx_chn->dma_dev, data, pkt_len, DMA_TO_DEVICE);
+ if (dma_mapping_error(tx_chn->dma_dev, buf_dma)) {
+ netdev_err(emac->ndev, "cmd %x: failed to map cmd buffer\n", cmd);
+ ret = -EINVAL;
+ goto err_unlock;
+ }
+
+ first_desc = k3_cppi_desc_pool_alloc(tx_chn->desc_pool);
+ if (!first_desc) {
+ netdev_err(emac->ndev, "cmd %x: failed to allocate descriptor\n", cmd);
+ dma_unmap_single(tx_chn->dma_dev, buf_dma, pkt_len, DMA_TO_DEVICE);
+ ret = -ENOMEM;
+ goto err_unlock;
+ }
+
+ cppi5_hdesc_init(first_desc, CPPI5_INFO0_HDESC_EPIB_PRESENT,
+ PRUETH_NAV_PS_DATA_SIZE);
+ cppi5_hdesc_set_pkttype(first_desc, PRUETH_PKT_TYPE_CMD);
+ epib = first_desc->epib;
+ epib[0] = 0;
+ epib[1] = 0;
+
+ cppi5_hdesc_attach_buf(first_desc, buf_dma, pkt_len, buf_dma, pkt_len);
+ swdata = cppi5_hdesc_get_swdata(first_desc);
+ *swdata = data;
+
+ cppi5_hdesc_set_pktlen(first_desc, pkt_len);
+ desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool, first_desc);
+
+ /* send command */
+ reinit_completion(&emac->cmd_complete);
+ ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
+ if (ret) {
+ netdev_err(emac->ndev, "cmd %x: push failed: %d\n", cmd, ret);
+ goto free_desc;
+ }
+ ret = wait_for_completion_timeout(&emac->cmd_complete, msecs_to_jiffies(100));
+ if (!ret)
+ netdev_err(emac->ndev, "cmd %x: completion timeout\n", cmd);
+
+ mutex_unlock(&emac->cmd_lock);
+
+ return ret;
+free_desc:
+ prueth_xmit_free(tx_chn, first_desc);
+err_unlock:
+ mutex_unlock(&emac->cmd_lock);
+
+ return ret;
+}
+
+static void icssg_config_set_speed_sr1(struct prueth_emac *emac)
+{
+ u32 cmd = ICSSG_PSTATE_SPEED_DUPLEX_CMD_SR1, val;
+ struct prueth *prueth = emac->prueth;
+ int slice = prueth_emac_slice(emac);
+
+ val = icssg_rgmii_get_speed(prueth->miig_rt, slice);
+ /* firmware expects speed settings in bit 2-1 */
+ val <<= 1;
+ cmd |= val;
+
+ val = icssg_rgmii_get_fullduplex(prueth->miig_rt, slice);
+ /* firmware expects full duplex settings in bit 3 */
+ val <<= 3;
+ cmd |= val;
+
+ emac_send_command_sr1(emac, cmd);
+}
+
+/* called back by PHY layer if there is a change in link state of hw port */
+static void emac_adjust_link_sr1(struct net_device *ndev)
+{
+ struct prueth_emac *emac = netdev_priv(ndev);
+ struct phy_device *phydev = ndev->phydev;
+ struct prueth *prueth = emac->prueth;
+ bool new_state = false;
+ unsigned long flags;
+
+ if (phydev->link) {
+ /* check the mode of operation - full/half duplex */
+ if (phydev->duplex != emac->duplex) {
+ new_state = true;
+ emac->duplex = phydev->duplex;
+ }
+ if (phydev->speed != emac->speed) {
+ new_state = true;
+ emac->speed = phydev->speed;
+ }
+ if (!emac->link) {
+ new_state = true;
+ emac->link = 1;
+ }
+ } else if (emac->link) {
+ new_state = true;
+ emac->link = 0;
+
+ /* f/w should support 100 & 1000 */
+ emac->speed = SPEED_1000;
+
+ /* half duplex may not be supported by f/w */
+ emac->duplex = DUPLEX_FULL;
+ }
+
+ if (new_state) {
+ phy_print_status(phydev);
+
+ /* update RGMII and MII configuration based on PHY negotiated
+ * values
+ */
+ if (emac->link) {
+ /* Set the RGMII cfg for gig en and full duplex */
+ icssg_update_rgmii_cfg(prueth->miig_rt, emac);
+
+ /* update the Tx IPG based on 100M/1G speed */
+ spin_lock_irqsave(&emac->lock, flags);
+ icssg_config_ipg(emac);
+ spin_unlock_irqrestore(&emac->lock, flags);
+ icssg_config_set_speed_sr1(emac);
+ }
+ }
+
+ if (emac->link) {
+ /* reactivate the transmit queue */
+ netif_tx_wake_all_queues(ndev);
+ } else {
+ netif_tx_stop_all_queues(ndev);
+ prueth_cleanup_tx_ts(emac);
+ }
+}
+
+static int emac_phy_connect(struct prueth_emac *emac)
+{
+ struct prueth *prueth = emac->prueth;
+ struct net_device *ndev = emac->ndev;
+ /* connect PHY */
+ ndev->phydev = of_phy_connect(emac->ndev, emac->phy_node,
+ &emac_adjust_link_sr1, 0,
+ emac->phy_if);
+ if (!ndev->phydev) {
+ dev_err(prueth->dev, "couldn't connect to phy %s\n",
+ emac->phy_node->full_name);
+ return -ENODEV;
+ }
+
+ if (!emac->half_duplex) {
+ dev_dbg(prueth->dev, "half duplex mode is not supported\n");
+ phy_remove_link_mode(ndev->phydev, ETHTOOL_LINK_MODE_10baseT_Half_BIT);
+ }
+
+ /* Remove 100Mbits half-duplex due to RGMII misreporting connection
+ * as full duplex */
+ phy_remove_link_mode(ndev->phydev, ETHTOOL_LINK_MODE_100baseT_Half_BIT);
+
+ /* remove unsupported modes */
+ phy_remove_link_mode(ndev->phydev, ETHTOOL_LINK_MODE_1000baseT_Half_BIT);
+ phy_remove_link_mode(ndev->phydev, ETHTOOL_LINK_MODE_Pause_BIT);
+ phy_remove_link_mode(ndev->phydev, ETHTOOL_LINK_MODE_Asym_Pause_BIT);
+
+ if (emac->phy_if == PHY_INTERFACE_MODE_MII)
+ phy_set_max_speed(ndev->phydev, SPEED_100);
+
+ return 0;
+}
+
+/* get one packet from requested flow_id
+ *
+ * Returns skb pointer if packet found else NULL
+ * Caller must free the returned skb.
+ */
+static struct sk_buff *prueth_process_rx_mgm(struct prueth_emac *emac,
+ u32 flow_id)
+{
+ struct prueth_rx_chn *rx_chn = &emac->rx_mgm_chn;
+ struct net_device *ndev = emac->ndev;
+ struct cppi5_host_desc_t *desc_rx;
+ struct sk_buff *skb, *new_skb;
+ dma_addr_t desc_dma, buf_dma;
+ u32 buf_dma_len, pkt_len;
+ void **swdata;
+ int ret;
+
+ ret = k3_udma_glue_pop_rx_chn(rx_chn->rx_chn, flow_id, &desc_dma);
+ if (ret) {
+ if (ret != -ENODATA)
+ netdev_err(ndev, "rx mgm pop: failed: %d\n", ret);
+ return NULL;
+ }
+
+ if (cppi5_desc_is_tdcm(desc_dma)) /* Teardown */
+ return NULL;
+
+ desc_rx = k3_cppi_desc_pool_dma2virt(rx_chn->desc_pool, desc_dma);
+
+ /* Fix FW bug about incorrect PSDATA size */
+ if (cppi5_hdesc_get_psdata_size(desc_rx) != PRUETH_NAV_PS_DATA_SIZE) {
+ cppi5_hdesc_update_psdata_size(desc_rx,
+ PRUETH_NAV_PS_DATA_SIZE);
+ }
+
+ swdata = cppi5_hdesc_get_swdata(desc_rx);
+ skb = *swdata;
+ cppi5_hdesc_get_obuf(desc_rx, &buf_dma, &buf_dma_len);
+ pkt_len = cppi5_hdesc_get_pktlen(desc_rx);
+
+ dma_unmap_single(rx_chn->dma_dev, buf_dma, buf_dma_len, DMA_FROM_DEVICE);
+ k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx);
+
+ new_skb = netdev_alloc_skb_ip_align(ndev, PRUETH_MAX_PKT_SIZE);
+ /* if allocation fails we drop the packet but push the
+ * descriptor back to the ring with old skb to prevent a stall
+ */
+ if (!new_skb) {
+ netdev_err(ndev,
+ "skb alloc failed, dropped mgm pkt from flow %d\n",
+ flow_id);
+ new_skb = skb;
+ skb = NULL; /* return NULL */
+ } else {
+ /* return the filled skb */
+ skb_put(skb, pkt_len);
+ }
+
+ /* queue another DMA */
+ ret = prueth_dma_rx_push(emac, new_skb, &emac->rx_mgm_chn);
+ if (WARN_ON(ret < 0))
+ dev_kfree_skb_any(new_skb);
+
+ return skb;
+}
+
+static void prueth_tx_ts_sr1(struct prueth_emac *emac,
+ struct emac_tx_ts_response_sr1 *tsr)
+{
+ struct skb_shared_hwtstamps ssh;
+ u32 hi_ts, lo_ts, cookie;
+ struct sk_buff *skb;
+ u64 ns;
+
+ hi_ts = le32_to_cpu(tsr->hi_ts);
+ lo_ts = le32_to_cpu(tsr->lo_ts);
+
+ ns = (u64)hi_ts << 32 | lo_ts;
+
+ cookie = le32_to_cpu(tsr->cookie);
+ if (cookie >= PRUETH_MAX_TX_TS_REQUESTS) {
+ netdev_dbg(emac->ndev, "Invalid TX TS cookie 0x%x\n",
+ cookie);
+ return;
+ }
+
+ skb = emac->tx_ts_skb[cookie];
+ emac->tx_ts_skb[cookie] = NULL; /* free slot */
+
+ memset(&ssh, 0, sizeof(ssh));
+ ssh.hwtstamp = ns_to_ktime(ns);
+
+ skb_tstamp_tx(skb, &ssh);
+ dev_consume_skb_any(skb);
+}
+
+static irqreturn_t prueth_rx_mgm_ts_thread_sr1(int irq, void *dev_id)
+{
+ struct prueth_emac *emac = dev_id;
+ struct sk_buff *skb;
+
+ skb = prueth_process_rx_mgm(emac, PRUETH_RX_MGM_FLOW_TIMESTAMP_SR1);
+ if (!skb)
+ return IRQ_NONE;
+
+ prueth_tx_ts_sr1(emac, (void *)skb->data);
+ dev_kfree_skb_any(skb);
+
+ return IRQ_HANDLED;
+}
+
+static irqreturn_t prueth_rx_mgm_rsp_thread(int irq, void *dev_id)
+{
+ struct prueth_emac *emac = dev_id;
+ struct sk_buff *skb;
+ u32 rsp;
+
+ skb = prueth_process_rx_mgm(emac, PRUETH_RX_MGM_FLOW_RESPONSE_SR1);
+ if (!skb)
+ return IRQ_NONE;
+
+ /* Process command response */
+ rsp = le32_to_cpu(*(__le32 *)skb->data) & 0xffff0000;
+ if (rsp == ICSSG_SHUTDOWN_CMD_SR1) {
+ netdev_dbg(emac->ndev, "f/w Shutdown cmd resp %x\n", rsp);
+ complete(&emac->cmd_complete);
+ } else if (rsp == ICSSG_PSTATE_SPEED_DUPLEX_CMD_SR1) {
+ netdev_dbg(emac->ndev, "f/w Speed/Duplex cmd rsp %x\n", rsp);
+ complete(&emac->cmd_complete);
+ }
+
+ dev_kfree_skb_any(skb);
+
+ return IRQ_HANDLED;
+}
+
+static struct icssg_firmwares icssg_sr1_emac_firmwares[] = {
+ {
+ .pru = "ti-pruss/am65x-pru0-prueth-fw.elf",
+ .rtu = "ti-pruss/am65x-rtu0-prueth-fw.elf",
+ },
+ {
+ .pru = "ti-pruss/am65x-pru1-prueth-fw.elf",
+ .rtu = "ti-pruss/am65x-rtu1-prueth-fw.elf",
+ }
+};
+
+static int prueth_emac_start(struct prueth *prueth, struct prueth_emac *emac)
+{
+ struct icssg_firmwares *firmwares;
+ struct device *dev = prueth->dev;
+ int slice, ret;
+
+ firmwares = icssg_sr1_emac_firmwares;
+
+ slice = prueth_emac_slice(emac);
+ if (slice < 0) {
+ netdev_err(emac->ndev, "invalid port\n");
+ return -EINVAL;
+ }
+
+ icssg_config_sr1(prueth, emac, slice);
+
+ ret = rproc_set_firmware(prueth->pru[slice], firmwares[slice].pru);
+ ret = rproc_boot(prueth->pru[slice]);
+ if (ret) {
+ dev_err(dev, "failed to boot PRU%d: %d\n", slice, ret);
+ return -EINVAL;
+ }
+
+ ret = rproc_set_firmware(prueth->rtu[slice], firmwares[slice].rtu);
+ ret = rproc_boot(prueth->rtu[slice]);
+ if (ret) {
+ dev_err(dev, "failed to boot RTU%d: %d\n", slice, ret);
+ goto halt_pru;
+ }
+
+ emac->fw_running = 1;
+ return 0;
+
+halt_pru:
+ rproc_shutdown(prueth->pru[slice]);
+
+ return ret;
+}
+
+/**
+ * emac_ndo_open - EMAC device open
+ * @ndev: network adapter device
+ *
+ * Called when system wants to start the interface.
+ *
+ * Return: 0 for a successful open, or appropriate error code
+ */
+static int emac_ndo_open(struct net_device *ndev)
+{
+ struct prueth_emac *emac = netdev_priv(ndev);
+ int num_data_chn = emac->tx_ch_num - 1;
+ struct prueth *prueth = emac->prueth;
+ int slice = prueth_emac_slice(emac);
+ struct device *dev = prueth->dev;
+ int max_rx_flows, rx_flow;
+ int ret, i;
+
+ /* clear SMEM and MSMC settings for all slices */
+ if (!prueth->emacs_initialized) {
+ memset_io(prueth->msmcram.va, 0, prueth->msmcram.size);
+ memset_io(prueth->shram.va, 0, ICSSG_CONFIG_OFFSET_SLICE1 * PRUETH_NUM_MACS);
+ }
+
+ /* set h/w MAC as user might have re-configured */
+ ether_addr_copy(emac->mac_addr, ndev->dev_addr);
+
+ icssg_class_set_mac_addr(prueth->miig_rt, slice, emac->mac_addr);
+
+ icssg_class_default(prueth->miig_rt, slice, 0, true);
+
+ /* Notify the stack of the actual queue counts. */
+ ret = netif_set_real_num_tx_queues(ndev, num_data_chn);
+ if (ret) {
+ dev_err(dev, "cannot set real number of tx queues\n");
+ return ret;
+ }
+
+ init_completion(&emac->cmd_complete);
+ ret = prueth_init_tx_chns(emac);
+ if (ret) {
+ dev_err(dev, "failed to init tx channel: %d\n", ret);
+ return ret;
+ }
+
+ max_rx_flows = PRUETH_MAX_RX_FLOWS_SR1;
+ ret = prueth_init_rx_chns(emac, &emac->rx_chns, "rx",
+ max_rx_flows, PRUETH_MAX_RX_DESC);
+ if (ret) {
+ dev_err(dev, "failed to init rx channel: %d\n", ret);
+ goto cleanup_tx;
+ }
+
+ ret = prueth_init_rx_chns(emac, &emac->rx_mgm_chn, "rxmgm",
+ PRUETH_MAX_RX_MGM_FLOWS_SR1,
+ PRUETH_MAX_RX_MGM_DESC_SR1);
+ if (ret) {
+ dev_err(dev, "failed to init rx mgmt channel: %d\n",
+ ret);
+ goto cleanup_rx;
+ }
+
+ ret = prueth_ndev_add_tx_napi(emac);
+ if (ret)
+ goto cleanup_rx_mgm;
+
+ /* we use only the highest priority flow for now i.e. @irq[3] */
+ rx_flow = PRUETH_RX_FLOW_DATA_SR1;
+ ret = request_irq(emac->rx_chns.irq[rx_flow], prueth_rx_irq,
+ IRQF_TRIGGER_HIGH, dev_name(dev), emac);
+ if (ret) {
+ dev_err(dev, "unable to request RX IRQ\n");
+ goto cleanup_napi;
+ }
+
+ ret = request_threaded_irq(emac->rx_mgm_chn.irq[PRUETH_RX_MGM_FLOW_RESPONSE_SR1],
+ NULL, prueth_rx_mgm_rsp_thread,
+ IRQF_ONESHOT | IRQF_TRIGGER_HIGH,
+ dev_name(dev), emac);
+ if (ret) {
+ dev_err(dev, "unable to request RX Management RSP IRQ\n");
+ goto free_rx_irq;
+ }
+
+ ret = request_threaded_irq(emac->rx_mgm_chn.irq[PRUETH_RX_MGM_FLOW_TIMESTAMP_SR1],
+ NULL, prueth_rx_mgm_ts_thread_sr1,
+ IRQF_ONESHOT | IRQF_TRIGGER_HIGH,
+ dev_name(dev), emac);
+ if (ret) {
+ dev_err(dev, "unable to request RX Management TS IRQ\n");
+ goto free_rx_mgm_rsp_irq;
+ }
+
+ /* reset and start PRU firmware */
+ ret = prueth_emac_start(prueth, emac);
+ if (ret)
+ goto free_rx_mgmt_ts_irq;
+
+ icssg_mii_update_mtu(prueth->mii_rt, slice, ndev->max_mtu);
+
+ /* Prepare RX */
+ ret = prueth_prepare_rx_chan(emac, &emac->rx_chns, PRUETH_MAX_PKT_SIZE);
+ if (ret)
+ goto stop;
+
+ ret = prueth_prepare_rx_chan(emac, &emac->rx_mgm_chn, 64);
+ if (ret)
+ goto reset_rx_chn;
+
+ ret = k3_udma_glue_enable_rx_chn(emac->rx_mgm_chn.rx_chn);
+ if (ret)
+ goto reset_rx_chn;
+
+ ret = k3_udma_glue_enable_rx_chn(emac->rx_chns.rx_chn);
+ if (ret)
+ goto reset_rx_mgm_chn;
+
+ for (i = 0; i < emac->tx_ch_num; i++) {
+ ret = k3_udma_glue_enable_tx_chn(emac->tx_chns[i].tx_chn);
+ if (ret)
+ goto reset_tx_chan;
+ }
+
+ /* Enable NAPI in Tx and Rx direction */
+ for (i = 0; i < emac->tx_ch_num; i++)
+ napi_enable(&emac->tx_chns[i].napi_tx);
+ napi_enable(&emac->napi_rx);
+
+ /* start PHY */
+ phy_start(ndev->phydev);
+
+ prueth->emacs_initialized++;
+
+ queue_work(system_long_wq, &emac->stats_work.work);
+
+ return 0;
+
+reset_tx_chan:
+ /* Since interface is not yet up, there wouldn't be
+ * any SKB for completion. So set false to free_skb
+ */
+ prueth_reset_tx_chan(emac, i, false);
+reset_rx_mgm_chn:
+ prueth_reset_rx_chan(&emac->rx_mgm_chn,
+ PRUETH_MAX_RX_MGM_FLOWS_SR1, true);
+reset_rx_chn:
+ prueth_reset_rx_chan(&emac->rx_chns, max_rx_flows, false);
+stop:
+ prueth_emac_stop(emac);
+free_rx_mgmt_ts_irq:
+ free_irq(emac->rx_mgm_chn.irq[PRUETH_RX_MGM_FLOW_TIMESTAMP_SR1],
+ emac);
+free_rx_mgm_rsp_irq:
+ free_irq(emac->rx_mgm_chn.irq[PRUETH_RX_MGM_FLOW_RESPONSE_SR1],
+ emac);
+free_rx_irq:
+ free_irq(emac->rx_chns.irq[rx_flow], emac);
+cleanup_napi:
+ prueth_ndev_del_tx_napi(emac, emac->tx_ch_num);
+cleanup_rx_mgm:
+ prueth_cleanup_rx_chns(emac, &emac->rx_mgm_chn,
+ PRUETH_MAX_RX_MGM_FLOWS_SR1);
+cleanup_rx:
+ prueth_cleanup_rx_chns(emac, &emac->rx_chns, max_rx_flows);
+cleanup_tx:
+ prueth_cleanup_tx_chns(emac);
+
+ return ret;
+}
+
+/**
+ * emac_ndo_stop - EMAC device stop
+ * @ndev: network adapter device
+ *
+ * Called when system wants to stop or down the interface.
+ *
+ * Return: Always 0 (Success)
+ */
+static int emac_ndo_stop(struct net_device *ndev)
+{
+ struct prueth_emac *emac = netdev_priv(ndev);
+ int rx_flow = PRUETH_RX_FLOW_DATA_SR1;
+ struct prueth *prueth = emac->prueth;
+ int max_rx_flows;
+ int ret, i;
+
+ /* inform the upper layers. */
+ netif_tx_stop_all_queues(ndev);
+
+ /* block packets from wire */
+ if (ndev->phydev)
+ phy_stop(ndev->phydev);
+
+ icssg_class_disable(prueth->miig_rt, prueth_emac_slice(emac));
+
+ emac_send_command_sr1(emac, ICSSG_SHUTDOWN_CMD_SR1);
+
+ atomic_set(&emac->tdown_cnt, emac->tx_ch_num);
+ /* ensure new tdown_cnt value is visible */
+ smp_mb__after_atomic();
+ /* tear down and disable UDMA channels */
+ reinit_completion(&emac->tdown_complete);
+ for (i = 0; i < emac->tx_ch_num; i++)
+ k3_udma_glue_tdown_tx_chn(emac->tx_chns[i].tx_chn, false);
+
+ ret = wait_for_completion_timeout(&emac->tdown_complete,
+ msecs_to_jiffies(1000));
+ if (!ret)
+ netdev_err(ndev, "tx teardown timeout\n");
+
+ prueth_reset_tx_chan(emac, emac->tx_ch_num, true);
+ for (i = 0; i < emac->tx_ch_num; i++)
+ napi_disable(&emac->tx_chns[i].napi_tx);
+
+ max_rx_flows = PRUETH_MAX_RX_FLOWS_SR1;
+ k3_udma_glue_tdown_rx_chn(emac->rx_chns.rx_chn, true);
+
+ prueth_reset_rx_chan(&emac->rx_chns, max_rx_flows, true);
+ /* Teardown RX MGM channel */
+ k3_udma_glue_tdown_rx_chn(emac->rx_mgm_chn.rx_chn, true);
+ prueth_reset_rx_chan(&emac->rx_mgm_chn,
+ PRUETH_MAX_RX_MGM_FLOWS_SR1, true);
+
+ napi_disable(&emac->napi_rx);
+
+ /* Destroying the queued work in ndo_stop() */
+ cancel_delayed_work_sync(&emac->stats_work);
+
+ /* stop PRUs */
+ prueth_emac_stop(emac);
+
+ free_irq(emac->rx_mgm_chn.irq[PRUETH_RX_MGM_FLOW_TIMESTAMP_SR1], emac);
+ free_irq(emac->rx_mgm_chn.irq[PRUETH_RX_MGM_FLOW_RESPONSE_SR1], emac);
+ free_irq(emac->rx_chns.irq[rx_flow], emac);
+ prueth_ndev_del_tx_napi(emac, emac->tx_ch_num);
+ prueth_cleanup_tx_chns(emac);
+
+ prueth_cleanup_rx_chns(emac, &emac->rx_mgm_chn, PRUETH_MAX_RX_MGM_FLOWS_SR1);
+ prueth_cleanup_rx_chns(emac, &emac->rx_chns, max_rx_flows);
+
+ prueth->emacs_initialized--;
+
+ return 0;
+}
+
+static void emac_ndo_set_rx_mode_sr1(struct net_device *ndev)
+{
+ struct prueth_emac *emac = netdev_priv(ndev);
+ bool allmulti = ndev->flags & IFF_ALLMULTI;
+ bool promisc = ndev->flags & IFF_PROMISC;
+ struct prueth *prueth = emac->prueth;
+ int slice = prueth_emac_slice(emac);
+
+ if (promisc) {
+ icssg_class_promiscuous_sr1(prueth->miig_rt, slice);
+ return;
+ }
+
+ if (allmulti) {
+ icssg_class_default(prueth->miig_rt, slice, 1, true);
+ return;
+ }
+
+ icssg_class_default(prueth->miig_rt, slice, 0, true);
+ if (!netdev_mc_empty(ndev)) {
+ /* program multicast address list into Classifier */
+ icssg_class_add_mcast_sr1(prueth->miig_rt, slice, ndev);
+ }
+}
+
+static const struct net_device_ops emac_netdev_ops = {
+ .ndo_open = emac_ndo_open,
+ .ndo_stop = emac_ndo_stop,
+ .ndo_start_xmit = emac_ndo_start_xmit,
+ .ndo_set_mac_address = eth_mac_addr,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_tx_timeout = emac_ndo_tx_timeout,
+ .ndo_set_rx_mode = emac_ndo_set_rx_mode_sr1,
+ .ndo_eth_ioctl = emac_ndo_ioctl,
+ .ndo_get_stats64 = emac_ndo_get_stats64,
+ .ndo_get_phys_port_name = emac_ndo_get_phys_port_name,
+};
+
+static int prueth_netdev_init(struct prueth *prueth,
+ struct device_node *eth_node)
+{
+ struct prueth_emac *emac;
+ struct net_device *ndev;
+ enum prueth_port port;
+ enum prueth_mac mac;
+ /* Only enable one TX channel due to timeouts when
+ * using multiple channels */
+ int num_tx_chn = 1;
+ int ret;
+
+ port = prueth_node_port(eth_node);
+ if (port == PRUETH_PORT_INVALID)
+ return -EINVAL;
+
+ mac = prueth_node_mac(eth_node);
+ if (mac == PRUETH_MAC_INVALID)
+ return -EINVAL;
+
+ ndev = alloc_etherdev_mq(sizeof(*emac), num_tx_chn);
+ if (!ndev)
+ return -ENOMEM;
+
+ emac = netdev_priv(ndev);
+ emac->is_sr1 = 1;
+ emac->prueth = prueth;
+ emac->ndev = ndev;
+ emac->port_id = port;
+ emac->cmd_wq = create_singlethread_workqueue("icssg_cmd_wq");
+ if (!emac->cmd_wq) {
+ ret = -ENOMEM;
+ goto free_ndev;
+ }
+
+ INIT_DELAYED_WORK(&emac->stats_work, emac_stats_work_handler);
+
+ ret = pruss_request_mem_region(prueth->pruss,
+ port == PRUETH_PORT_MII0 ?
+ PRUSS_MEM_DRAM0 : PRUSS_MEM_DRAM1,
+ &emac->dram);
+ if (ret) {
+ dev_err(prueth->dev, "unable to get DRAM: %d\n", ret);
+ ret = -ENOMEM;
+ goto free_wq;
+ }
+
+ /* SR1.0 uses a dedicated high priority channel
+ * to send commands to the firmware
+ */
+ emac->tx_ch_num = 2;
+
+ SET_NETDEV_DEV(ndev, prueth->dev);
+ spin_lock_init(&emac->lock);
+ mutex_init(&emac->cmd_lock);
+
+ emac->phy_node = of_parse_phandle(eth_node, "phy-handle", 0);
+ if (!emac->phy_node && !of_phy_is_fixed_link(eth_node)) {
+ dev_err(prueth->dev, "couldn't find phy-handle\n");
+ ret = -ENODEV;
+ goto free;
+ } else if (of_phy_is_fixed_link(eth_node)) {
+ ret = of_phy_register_fixed_link(eth_node);
+ if (ret) {
+ ret = dev_err_probe(prueth->dev, ret,
+ "failed to register fixed-link phy\n");
+ goto free;
+ }
+
+ emac->phy_node = eth_node;
+ }
+
+ ret = of_get_phy_mode(eth_node, &emac->phy_if);
+ if (ret) {
+ dev_err(prueth->dev, "could not get phy-mode property\n");
+ goto free;
+ }
+
+ if (emac->phy_if != PHY_INTERFACE_MODE_MII &&
+ !phy_interface_mode_is_rgmii(emac->phy_if)) {
+ dev_err(prueth->dev, "PHY mode unsupported %s\n", phy_modes(emac->phy_if));
+ ret = -EINVAL;
+ goto free;
+ }
+
+ /* AM65 SR2.0 has TX Internal delay always enabled by hardware
+ * and it is not possible to disable TX Internal delay. The below
+ * switch case block describes how we handle different phy modes
+ * based on hardware restriction.
+ */
+ switch (emac->phy_if) {
+ case PHY_INTERFACE_MODE_RGMII_ID:
+ emac->phy_if = PHY_INTERFACE_MODE_RGMII_RXID;
+ break;
+ case PHY_INTERFACE_MODE_RGMII_TXID:
+ emac->phy_if = PHY_INTERFACE_MODE_RGMII;
+ break;
+ case PHY_INTERFACE_MODE_RGMII:
+ case PHY_INTERFACE_MODE_RGMII_RXID:
+ dev_err(prueth->dev, "RGMII mode without TX delay is not supported\n");
+ ret = -EINVAL;
+ goto free;
+ default:
+ break;
+ }
+
+ /* get mac address from DT and set private and netdev addr */
+ ret = of_get_ethdev_address(eth_node, ndev);
+ if (!is_valid_ether_addr(ndev->dev_addr)) {
+ eth_hw_addr_random(ndev);
+ dev_warn(prueth->dev, "port %d: using random MAC addr: %pM\n",
+ port, ndev->dev_addr);
+ }
+ ether_addr_copy(emac->mac_addr, ndev->dev_addr);
+
+ ndev->min_mtu = PRUETH_MIN_PKT_SIZE;
+ ndev->max_mtu = PRUETH_MAX_MTU;
+ ndev->netdev_ops = &emac_netdev_ops;
+ ndev->ethtool_ops = &icssg_ethtool_ops;
+ ndev->hw_features = NETIF_F_SG;
+ ndev->features = ndev->hw_features;
+
+ netif_napi_add(ndev, &emac->napi_rx, emac_napi_rx_poll);
+ prueth->emac[mac] = emac;
+
+ return 0;
+
+free:
+ pruss_release_mem_region(prueth->pruss, &emac->dram);
+free_wq:
+ destroy_workqueue(emac->cmd_wq);
+free_ndev:
+ emac->ndev = NULL;
+ prueth->emac[mac] = NULL;
+ free_netdev(ndev);
+
+ return ret;
+}
+
+static int prueth_probe(struct platform_device *pdev)
+{
+ struct device_node *eth_node, *eth_ports_node;
+ struct device_node *eth0_node = NULL;
+ struct device_node *eth1_node = NULL;
+ struct device *dev = &pdev->dev;
+ struct device_node *np;
+ struct prueth *prueth;
+ struct pruss *pruss;
+ u32 msmc_ram_size;
+ int i, ret;
+
+ np = dev->of_node;
+
+ prueth = devm_kzalloc(dev, sizeof(*prueth), GFP_KERNEL);
+ if (!prueth)
+ return -ENOMEM;
+
+ dev_set_drvdata(dev, prueth);
+ prueth->pdev = pdev;
+ prueth->pdata = *(const struct prueth_pdata *)device_get_match_data(dev);
+
+ prueth->dev = dev;
+ eth_ports_node = of_get_child_by_name(np, "ethernet-ports");
+ if (!eth_ports_node)
+ return -ENOENT;
+
+ for_each_child_of_node(eth_ports_node, eth_node) {
+ u32 reg;
+
+ if (strcmp(eth_node->name, "port"))
+ continue;
+ ret = of_property_read_u32(eth_node, "reg", &reg);
+ if (ret < 0) {
+ dev_err(dev, "%pOF error reading port_id %d\n",
+ eth_node, ret);
+ }
+
+ of_node_get(eth_node);
+
+ if (reg == 0) {
+ eth0_node = eth_node;
+ if (!of_device_is_available(eth0_node)) {
+ of_node_put(eth0_node);
+ eth0_node = NULL;
+ }
+ } else if (reg == 1) {
+ eth1_node = eth_node;
+ if (!of_device_is_available(eth1_node)) {
+ of_node_put(eth1_node);
+ eth1_node = NULL;
+ }
+ } else {
+ dev_err(dev, "port reg should be 0 or 1\n");
+ }
+ }
+
+ of_node_put(eth_ports_node);
+
+ /* At least one node must be present and available else we fail */
+ if (!eth0_node && !eth1_node) {
+ dev_err(dev, "neither port0 nor port1 node available\n");
+ return -ENODEV;
+ }
+
+ if (eth0_node == eth1_node) {
+ dev_err(dev, "port0 and port1 can't have same reg\n");
+ of_node_put(eth0_node);
+ return -ENODEV;
+ }
+
+ prueth->eth_node[PRUETH_MAC0] = eth0_node;
+ prueth->eth_node[PRUETH_MAC1] = eth1_node;
+
+ prueth->miig_rt = syscon_regmap_lookup_by_phandle(np, "ti,mii-g-rt");
+ if (IS_ERR(prueth->miig_rt)) {
+ dev_err(dev, "couldn't get ti,mii-g-rt syscon regmap\n");
+ return -ENODEV;
+ }
+
+ prueth->mii_rt = syscon_regmap_lookup_by_phandle(np, "ti,mii-rt");
+ if (IS_ERR(prueth->mii_rt)) {
+ dev_err(dev, "couldn't get ti,mii-rt syscon regmap\n");
+ return -ENODEV;
+ }
+
+ if (eth0_node) {
+ ret = prueth_get_cores(prueth, ICSS_SLICE0, true);
+ if (ret)
+ goto put_cores;
+ }
+
+ if (eth1_node) {
+ ret = prueth_get_cores(prueth, ICSS_SLICE1, true);
+ if (ret)
+ goto put_cores;
+ }
+
+ pruss = pruss_get(eth0_node ?
+ prueth->pru[ICSS_SLICE0] : prueth->pru[ICSS_SLICE1]);
+ if (IS_ERR(pruss)) {
+ ret = PTR_ERR(pruss);
+ dev_err(dev, "unable to get pruss handle\n");
+ goto put_cores;
+ }
+
+ prueth->pruss = pruss;
+
+ ret = pruss_request_mem_region(pruss, PRUSS_MEM_SHRD_RAM2,
+ &prueth->shram);
+ if (ret) {
+ dev_err(dev, "unable to get PRUSS SHRD RAM2: %d\n", ret);
+ goto put_pruss;
+ }
+
+ prueth->sram_pool = of_gen_pool_get(np, "sram", 0);
+ if (!prueth->sram_pool) {
+ dev_err(dev, "unable to get SRAM pool\n");
+ ret = -ENODEV;
+
+ goto put_mem;
+ }
+
+ msmc_ram_size = MSMC_RAM_SIZE_SR1;
+
+ prueth->msmcram.va = (void __iomem *)gen_pool_alloc(prueth->sram_pool,
+ msmc_ram_size);
+
+ if (!prueth->msmcram.va) {
+ ret = -ENOMEM;
+ dev_err(dev, "unable to allocate MSMC resource\n");
+ goto put_mem;
+ }
+ prueth->msmcram.pa = gen_pool_virt_to_phys(prueth->sram_pool,
+ (unsigned long)prueth->msmcram.va);
+ prueth->msmcram.size = msmc_ram_size;
+ memset_io(prueth->msmcram.va, 0, msmc_ram_size);
+ dev_dbg(dev, "sram: pa %llx va %p size %zx\n", prueth->msmcram.pa,
+ prueth->msmcram.va, prueth->msmcram.size);
+
+ if (eth0_node) {
+ ret = prueth_netdev_init(prueth, eth0_node);
+ if (ret) {
+ dev_err_probe(dev, ret, "netdev init %s failed\n",
+ eth0_node->name);
+ goto free_pool;
+ }
+
+ if (of_find_property(eth0_node, "ti,half-duplex-capable", NULL))
+ prueth->emac[PRUETH_MAC0]->half_duplex = 1;
+ }
+
+ if (eth1_node) {
+ ret = prueth_netdev_init(prueth, eth1_node);
+ if (ret) {
+ dev_err_probe(dev, ret, "netdev init %s failed\n",
+ eth1_node->name);
+ goto netdev_exit;
+ }
+
+ if (of_find_property(eth1_node, "ti,half-duplex-capable", NULL))
+ prueth->emac[PRUETH_MAC1]->half_duplex = 1;
+ }
+
+ /* register the network devices */
+ if (eth0_node) {
+ ret = register_netdev(prueth->emac[PRUETH_MAC0]->ndev);
+ if (ret) {
+ dev_err(dev, "can't register netdev for port MII0\n");
+ goto netdev_exit;
+ }
+
+ prueth->registered_netdevs[PRUETH_MAC0] = prueth->emac[PRUETH_MAC0]->ndev;
+ emac_phy_connect(prueth->emac[PRUETH_MAC0]);
+ phy_attached_info(prueth->emac[PRUETH_MAC0]->ndev->phydev);
+ }
+
+ if (eth1_node) {
+ ret = register_netdev(prueth->emac[PRUETH_MAC1]->ndev);
+ if (ret) {
+ dev_err(dev, "can't register netdev for port MII1\n");
+ goto netdev_unregister;
+ }
+
+ prueth->registered_netdevs[PRUETH_MAC1] = prueth->emac[PRUETH_MAC1]->ndev;
+ emac_phy_connect(prueth->emac[PRUETH_MAC1]);
+ phy_attached_info(prueth->emac[PRUETH_MAC1]->ndev->phydev);
+ }
+
+ dev_info(dev, "TI PRU SR1.0 ethernet driver initialized: %s EMAC mode\n",
+ (!eth0_node || !eth1_node) ? "single" : "dual");
+
+ if (eth1_node)
+ of_node_put(eth1_node);
+ if (eth0_node)
+ of_node_put(eth0_node);
+
+ return 0;
+
+netdev_unregister:
+ for (i = 0; i < PRUETH_NUM_MACS; i++) {
+ if (!prueth->registered_netdevs[i])
+ continue;
+
+ if (prueth->emac[i]->ndev->phydev) {
+ phy_disconnect(prueth->emac[i]->ndev->phydev);
+ prueth->emac[i]->ndev->phydev = NULL;
+ }
+ unregister_netdev(prueth->registered_netdevs[i]);
+ }
+
+netdev_exit:
+ for (i = 0; i < PRUETH_NUM_MACS; i++) {
+ eth_node = prueth->eth_node[i];
+ if (!eth_node)
+ continue;
+
+ prueth_netdev_exit(prueth, eth_node);
+ }
+
+free_pool:
+ gen_pool_free(prueth->sram_pool,
+ (unsigned long)prueth->msmcram.va, msmc_ram_size);
+
+put_mem:
+ pruss_release_mem_region(prueth->pruss, &prueth->shram);
+
+put_pruss:
+ pruss_put(prueth->pruss);
+
+put_cores:
+ if (eth1_node) {
+ prueth_put_cores(prueth, ICSS_SLICE1);
+ of_node_put(eth1_node);
+ }
+
+ if (eth0_node) {
+ prueth_put_cores(prueth, ICSS_SLICE0);
+ of_node_put(eth0_node);
+ }
+
+ return ret;
+}
+
+static void prueth_remove(struct platform_device *pdev)
+{
+ struct prueth *prueth = platform_get_drvdata(pdev);
+ struct device_node *eth_node;
+ int i;
+
+ for (i = 0; i < PRUETH_NUM_MACS; i++) {
+ if (!prueth->registered_netdevs[i])
+ continue;
+ phy_stop(prueth->emac[i]->ndev->phydev);
+ phy_disconnect(prueth->emac[i]->ndev->phydev);
+ prueth->emac[i]->ndev->phydev = NULL;
+ unregister_netdev(prueth->registered_netdevs[i]);
+ }
+
+ for (i = 0; i < PRUETH_NUM_MACS; i++) {
+ eth_node = prueth->eth_node[i];
+ if (!eth_node)
+ continue;
+
+ prueth_netdev_exit(prueth, eth_node);
+ }
+
+ gen_pool_free(prueth->sram_pool,
+ (unsigned long)prueth->msmcram.va,
+ MSMC_RAM_SIZE_SR1);
+
+ pruss_release_mem_region(prueth->pruss, &prueth->shram);
+
+ pruss_put(prueth->pruss);
+
+ if (prueth->eth_node[PRUETH_MAC1])
+ prueth_put_cores(prueth, ICSS_SLICE1);
+
+ if (prueth->eth_node[PRUETH_MAC0])
+ prueth_put_cores(prueth, ICSS_SLICE0);
+}
+
+static const struct prueth_pdata am654_sr1_icssg_pdata = {
+ .fdqring_mode = K3_RINGACC_RING_MODE_MESSAGE,
+};
+
+static const struct of_device_id prueth_dt_match[] = {
+ { .compatible = "ti,am654-sr1-icssg-prueth", .data = &am654_sr1_icssg_pdata },
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, prueth_dt_match);
+
+static struct platform_driver prueth_driver = {
+ .probe = prueth_probe,
+ .remove_new = prueth_remove,
+ .driver = {
+ .name = "icssg-prueth-sr1",
+ .of_match_table = prueth_dt_match,
+ .pm = &prueth_dev_pm_ops,
+ },
+};
+module_platform_driver(prueth_driver);
+
+MODULE_AUTHOR("Roger Quadros <rogerq@ti.com>");
+MODULE_AUTHOR("Md Danish Anwar <danishanwar@ti.com>");
+MODULE_AUTHOR("Diogo Ivo <diogo.ivo@siemens.com>");
+MODULE_DESCRIPTION(PRUETH_MODULE_DESCRIPTION);
+MODULE_LICENSE("GPL");
--
2.44.0
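
Reader's note on icssg_config_set_speed_sr1() and prueth_rx_mgm_rsp_thread() above: the command word carries the speed code in bits 2-1 and the full-duplex flag in bit 3, while the response handler compares only the upper 16 bits of the reply against the command opcode. The stand-alone sketch below illustrates that packing in user-space C. It is not driver code: EXAMPLE_SPEED_DUPLEX_CMD is a hypothetical placeholder (the real ICSSG_PSTATE_SPEED_DUPLEX_CMD_SR1 value is not shown in this patch), the example_* names are invented for illustration, and it assumes that the speed code fits in two bits and that the opcode has no bits set in its lower half.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholder value; the real opcode lives in the driver
 * headers and is not visible in this patch.
 */
#define EXAMPLE_SPEED_DUPLEX_CMD	0x02000000u

/* Mask applied by the response handler before comparing opcodes */
#define EXAMPLE_RSP_OPCODE_MASK		0xffff0000u

/* Pack a 2-bit speed code into bits 2-1 and the duplex flag into bit 3 */
static uint32_t example_pack_speed_duplex(uint32_t base_cmd, uint32_t speed,
					  uint32_t full_duplex)
{
	assert(speed <= 3);		/* assumed: two-bit speed code */
	assert(full_duplex <= 1);	/* single-bit flag */

	return base_cmd | (speed << 1) | (full_duplex << 3);
}

/* Strip the payload bits so only the opcode remains for comparison */
static uint32_t example_rsp_opcode(uint32_t rsp)
{
	return rsp & EXAMPLE_RSP_OPCODE_MASK;
}
```

With the placeholder opcode, a speed code of 2 and full duplex pack to 0x0200000c, and example_rsp_opcode() of that value gives back 0x02000000. This shows why payload bits in the lower half of the word would not disturb the opcode comparison in the handler.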
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [PATCH net-next v6 03/10] net: ti: icssg-prueth: Move common functions into a separate file
From: Diogo Ivo @ 2024-04-03 10:48 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, danishanwar, rogerq, andrew,
vigneshr, wsa+renesas, dan.carpenter, netdev, linux-arm-kernel
Cc: Diogo Ivo, jan.kiszka
In-Reply-To: <20240403104821.283832-1-diogo.ivo@siemens.com>
In order to allow code sharing between Silicon Revisions 1.0 and 2.0
move all functions that can be shared into a common file. This commit
introduces no functional changes.
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
---
Changes in v5:
- Added Reviewed-by tag from Danish
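
Reader's note: one of the helpers moved into icssg_common.c, emac_tx_complete_packets(), takes part in the TX teardown handshake. The stop path sets tdown_cnt to the number of TX channels, and each teardown completion marker decrements it; the decrement that reaches zero signals tdown_complete. The sketch below models that countdown in portable C11, purely as an illustration. The example_* names are hypothetical, a plain bool stands in for the kernel's struct completion, and the model is single-threaded, so it does not reproduce the kernel's memory-ordering guarantees.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct example_tdown {
	atomic_int remaining;	/* channels that have not reported teardown */
	bool complete;		/* stands in for struct completion */
};

/* Arm the countdown before requesting teardown on each channel */
static void example_tdown_start(struct example_tdown *t, int num_chn)
{
	assert(num_chn > 0);
	t->complete = false;
	atomic_store(&t->remaining, num_chn);
}

/* Called once per channel when its teardown marker is popped.
 * Returns true only for the call that finishes the last channel.
 */
static bool example_tdown_chn_done(struct example_tdown *t)
{
	/* atomic_fetch_sub() returns the previous value, so 1 means
	 * this decrement brought the counter to zero.
	 */
	if (atomic_fetch_sub(&t->remaining, 1) == 1) {
		t->complete = true;
		return true;
	}

	return false;
}
```

Arming the counter with two channels and calling example_tdown_chn_done() twice sets complete only on the second call. In the driver the equivalent decrement-and-test is done by atomic_dec_and_test().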
drivers/net/ethernet/ti/Makefile | 1 +
drivers/net/ethernet/ti/icssg/icssg_common.c | 1198 ++++++++++++++++++
drivers/net/ethernet/ti/icssg/icssg_prueth.c | 1183 -----------------
drivers/net/ethernet/ti/icssg/icssg_prueth.h | 59 +
4 files changed, 1258 insertions(+), 1183 deletions(-)
create mode 100644 drivers/net/ethernet/ti/icssg/icssg_common.c
diff --git a/drivers/net/ethernet/ti/Makefile b/drivers/net/ethernet/ti/Makefile
index d8590304f3df..4876f20aa495 100644
--- a/drivers/net/ethernet/ti/Makefile
+++ b/drivers/net/ethernet/ti/Makefile
@@ -33,6 +33,7 @@ obj-$(CONFIG_TI_K3_AM65_CPTS) += am65-cpts.o
obj-$(CONFIG_TI_ICSSG_PRUETH) += icssg-prueth.o
icssg-prueth-y := icssg/icssg_prueth.o \
+ icssg/icssg_common.o \
icssg/icssg_classifier.o \
icssg/icssg_queues.o \
icssg/icssg_config.o \
diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
new file mode 100644
index 000000000000..99f27ecc9352
--- /dev/null
+++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
@@ -0,0 +1,1198 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Texas Instruments ICSSG Ethernet Driver
+ *
+ * Copyright (C) 2018-2022 Texas Instruments Incorporated - https://www.ti.com/
+ * Copyright (C) Siemens AG, 2024
+ *
+ */
+
+#include <linux/dma-mapping.h>
+#include <linux/dma/ti-cppi5.h>
+#include <linux/etherdevice.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/of.h>
+#include <linux/of_mdio.h>
+#include <linux/phy.h>
+#include <linux/remoteproc/pruss.h>
+#include <linux/regmap.h>
+#include <linux/remoteproc.h>
+
+#include "icssg_prueth.h"
+#include "../k3-cppi-desc-pool.h"
+
+/* Netif debug messages possible */
+#define PRUETH_EMAC_DEBUG (NETIF_MSG_DRV | \
+ NETIF_MSG_PROBE | \
+ NETIF_MSG_LINK | \
+ NETIF_MSG_TIMER | \
+ NETIF_MSG_IFDOWN | \
+ NETIF_MSG_IFUP | \
+ NETIF_MSG_RX_ERR | \
+ NETIF_MSG_TX_ERR | \
+ NETIF_MSG_TX_QUEUED | \
+ NETIF_MSG_INTR | \
+ NETIF_MSG_TX_DONE | \
+ NETIF_MSG_RX_STATUS | \
+ NETIF_MSG_PKTDATA | \
+ NETIF_MSG_HW | \
+ NETIF_MSG_WOL)
+
+#define prueth_napi_to_emac(napi) container_of(napi, struct prueth_emac, napi_rx)
+
+void prueth_cleanup_rx_chns(struct prueth_emac *emac,
+ struct prueth_rx_chn *rx_chn,
+ int max_rflows)
+{
+ if (rx_chn->desc_pool)
+ k3_cppi_desc_pool_destroy(rx_chn->desc_pool);
+
+ if (rx_chn->rx_chn)
+ k3_udma_glue_release_rx_chn(rx_chn->rx_chn);
+}
+
+void prueth_cleanup_tx_chns(struct prueth_emac *emac)
+{
+ int i;
+
+ for (i = 0; i < emac->tx_ch_num; i++) {
+ struct prueth_tx_chn *tx_chn = &emac->tx_chns[i];
+
+ if (tx_chn->desc_pool)
+ k3_cppi_desc_pool_destroy(tx_chn->desc_pool);
+
+ if (tx_chn->tx_chn)
+ k3_udma_glue_release_tx_chn(tx_chn->tx_chn);
+
+ /* Assume prueth_cleanup_tx_chns() is called at the
+ * end after all channel resources are freed
+ */
+ memset(tx_chn, 0, sizeof(*tx_chn));
+ }
+}
+
+void prueth_ndev_del_tx_napi(struct prueth_emac *emac, int num)
+{
+ int i;
+
+ for (i = 0; i < num; i++) {
+ struct prueth_tx_chn *tx_chn = &emac->tx_chns[i];
+
+ if (tx_chn->irq)
+ free_irq(tx_chn->irq, tx_chn);
+ netif_napi_del(&tx_chn->napi_tx);
+ }
+}
+
+void prueth_xmit_free(struct prueth_tx_chn *tx_chn,
+ struct cppi5_host_desc_t *desc)
+{
+ struct cppi5_host_desc_t *first_desc, *next_desc;
+ dma_addr_t buf_dma, next_desc_dma;
+ u32 buf_dma_len;
+
+ first_desc = desc;
+ next_desc = first_desc;
+
+ cppi5_hdesc_get_obuf(first_desc, &buf_dma, &buf_dma_len);
+ k3_udma_glue_tx_cppi5_to_dma_addr(tx_chn->tx_chn, &buf_dma);
+
+ dma_unmap_single(tx_chn->dma_dev, buf_dma, buf_dma_len,
+ DMA_TO_DEVICE);
+
+ next_desc_dma = cppi5_hdesc_get_next_hbdesc(first_desc);
+ k3_udma_glue_tx_cppi5_to_dma_addr(tx_chn->tx_chn, &next_desc_dma);
+ while (next_desc_dma) {
+ next_desc = k3_cppi_desc_pool_dma2virt(tx_chn->desc_pool,
+ next_desc_dma);
+ cppi5_hdesc_get_obuf(next_desc, &buf_dma, &buf_dma_len);
+ k3_udma_glue_tx_cppi5_to_dma_addr(tx_chn->tx_chn, &buf_dma);
+
+ dma_unmap_page(tx_chn->dma_dev, buf_dma, buf_dma_len,
+ DMA_TO_DEVICE);
+
+ next_desc_dma = cppi5_hdesc_get_next_hbdesc(next_desc);
+ k3_udma_glue_tx_cppi5_to_dma_addr(tx_chn->tx_chn, &next_desc_dma);
+
+ k3_cppi_desc_pool_free(tx_chn->desc_pool, next_desc);
+ }
+
+ k3_cppi_desc_pool_free(tx_chn->desc_pool, first_desc);
+}
+
+int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
+ int budget)
+{
+ struct net_device *ndev = emac->ndev;
+ struct cppi5_host_desc_t *desc_tx;
+ struct netdev_queue *netif_txq;
+ struct prueth_tx_chn *tx_chn;
+ unsigned int total_bytes = 0;
+ struct sk_buff *skb;
+ dma_addr_t desc_dma;
+ int res, num_tx = 0;
+ void **swdata;
+
+ tx_chn = &emac->tx_chns[chn];
+
+ while (true) {
+ res = k3_udma_glue_pop_tx_chn(tx_chn->tx_chn, &desc_dma);
+ if (res == -ENODATA)
+ break;
+
+ /* teardown completion */
+ if (cppi5_desc_is_tdcm(desc_dma)) {
+ if (atomic_dec_and_test(&emac->tdown_cnt))
+ complete(&emac->tdown_complete);
+ break;
+ }
+
+ desc_tx = k3_cppi_desc_pool_dma2virt(tx_chn->desc_pool,
+ desc_dma);
+ swdata = cppi5_hdesc_get_swdata(desc_tx);
+
+ skb = *(swdata);
+ prueth_xmit_free(tx_chn, desc_tx);
+
+ ndev = skb->dev;
+ ndev->stats.tx_packets++;
+ ndev->stats.tx_bytes += skb->len;
+ total_bytes += skb->len;
+ napi_consume_skb(skb, budget);
+ num_tx++;
+ }
+
+ if (!num_tx)
+ return 0;
+
+ netif_txq = netdev_get_tx_queue(ndev, chn);
+ netdev_tx_completed_queue(netif_txq, num_tx, total_bytes);
+
+ if (netif_tx_queue_stopped(netif_txq)) {
+ /* If the TX queue was stopped, wake it now
+ * if we have enough room.
+ */
+ __netif_tx_lock(netif_txq, smp_processor_id());
+ if (netif_running(ndev) &&
+ (k3_cppi_desc_pool_avail(tx_chn->desc_pool) >=
+ MAX_SKB_FRAGS))
+ netif_tx_wake_queue(netif_txq);
+ __netif_tx_unlock(netif_txq);
+ }
+
+ return num_tx;
+}
+
+static int emac_napi_tx_poll(struct napi_struct *napi_tx, int budget)
+{
+ struct prueth_tx_chn *tx_chn = prueth_napi_to_tx_chn(napi_tx);
+ struct prueth_emac *emac = tx_chn->emac;
+ int num_tx_packets;
+
+ num_tx_packets = emac_tx_complete_packets(emac, tx_chn->id, budget);
+
+ if (num_tx_packets >= budget)
+ return budget;
+
+ if (napi_complete_done(napi_tx, num_tx_packets))
+ enable_irq(tx_chn->irq);
+
+ return num_tx_packets;
+}
+
+static irqreturn_t prueth_tx_irq(int irq, void *dev_id)
+{
+ struct prueth_tx_chn *tx_chn = dev_id;
+
+ disable_irq_nosync(irq);
+ napi_schedule(&tx_chn->napi_tx);
+
+ return IRQ_HANDLED;
+}
+
+int prueth_ndev_add_tx_napi(struct prueth_emac *emac)
+{
+ struct prueth *prueth = emac->prueth;
+ int i, ret;
+
+ for (i = 0; i < emac->tx_ch_num; i++) {
+ struct prueth_tx_chn *tx_chn = &emac->tx_chns[i];
+
+ netif_napi_add_tx(emac->ndev, &tx_chn->napi_tx, emac_napi_tx_poll);
+ ret = request_irq(tx_chn->irq, prueth_tx_irq,
+ IRQF_TRIGGER_HIGH, tx_chn->name,
+ tx_chn);
+ if (ret) {
+ netif_napi_del(&tx_chn->napi_tx);
+ dev_err(prueth->dev, "unable to request TX IRQ %d\n",
+ tx_chn->irq);
+ goto fail;
+ }
+ }
+
+ return 0;
+fail:
+ prueth_ndev_del_tx_napi(emac, i);
+ return ret;
+}
+
+int prueth_init_tx_chns(struct prueth_emac *emac)
+{
+ static const struct k3_ring_cfg ring_cfg = {
+ .elm_size = K3_RINGACC_RING_ELSIZE_8,
+ .mode = K3_RINGACC_RING_MODE_RING,
+ .flags = 0,
+ .size = PRUETH_MAX_TX_DESC,
+ };
+ struct k3_udma_glue_tx_channel_cfg tx_cfg;
+ struct device *dev = emac->prueth->dev;
+ struct net_device *ndev = emac->ndev;
+ int ret, slice, i;
+ u32 hdesc_size;
+
+ slice = prueth_emac_slice(emac);
+ if (slice < 0)
+ return slice;
+
+ init_completion(&emac->tdown_complete);
+
+ hdesc_size = cppi5_hdesc_calc_size(true, PRUETH_NAV_PS_DATA_SIZE,
+ PRUETH_NAV_SW_DATA_SIZE);
+ memset(&tx_cfg, 0, sizeof(tx_cfg));
+ tx_cfg.swdata_size = PRUETH_NAV_SW_DATA_SIZE;
+ tx_cfg.tx_cfg = ring_cfg;
+ tx_cfg.txcq_cfg = ring_cfg;
+
+ for (i = 0; i < emac->tx_ch_num; i++) {
+ struct prueth_tx_chn *tx_chn = &emac->tx_chns[i];
+
+ /* To differentiate channels for SLICE0 vs SLICE1 */
+ snprintf(tx_chn->name, sizeof(tx_chn->name),
+ "tx%d-%d", slice, i);
+
+ tx_chn->emac = emac;
+ tx_chn->id = i;
+ tx_chn->descs_num = PRUETH_MAX_TX_DESC;
+
+ tx_chn->tx_chn =
+ k3_udma_glue_request_tx_chn(dev, tx_chn->name,
+ &tx_cfg);
+ if (IS_ERR(tx_chn->tx_chn)) {
+ ret = PTR_ERR(tx_chn->tx_chn);
+ tx_chn->tx_chn = NULL;
+ netdev_err(ndev,
+ "Failed to request tx dma ch: %d\n", ret);
+ goto fail;
+ }
+
+ tx_chn->dma_dev = k3_udma_glue_tx_get_dma_device(tx_chn->tx_chn);
+ tx_chn->desc_pool =
+ k3_cppi_desc_pool_create_name(tx_chn->dma_dev,
+ tx_chn->descs_num,
+ hdesc_size,
+ tx_chn->name);
+ if (IS_ERR(tx_chn->desc_pool)) {
+ ret = PTR_ERR(tx_chn->desc_pool);
+ tx_chn->desc_pool = NULL;
+ netdev_err(ndev, "Failed to create tx pool: %d\n", ret);
+ goto fail;
+ }
+
+ ret = k3_udma_glue_tx_get_irq(tx_chn->tx_chn);
+ if (ret < 0) {
+ netdev_err(ndev, "failed to get tx irq\n");
+ goto fail;
+ }
+ tx_chn->irq = ret;
+
+ snprintf(tx_chn->name, sizeof(tx_chn->name), "%s-tx%d",
+ dev_name(dev), tx_chn->id);
+ }
+
+ return 0;
+
+fail:
+ prueth_cleanup_tx_chns(emac);
+ return ret;
+}
+
+int prueth_init_rx_chns(struct prueth_emac *emac,
+ struct prueth_rx_chn *rx_chn,
+ char *name, u32 max_rflows,
+ u32 max_desc_num)
+{
+ struct k3_udma_glue_rx_channel_cfg rx_cfg;
+ struct device *dev = emac->prueth->dev;
+ struct net_device *ndev = emac->ndev;
+ u32 fdqring_id, hdesc_size;
+ int i, ret = 0, slice;
+
+ slice = prueth_emac_slice(emac);
+ if (slice < 0)
+ return slice;
+
+ /* To differentiate channels for SLICE0 vs SLICE1 */
+ snprintf(rx_chn->name, sizeof(rx_chn->name), "%s%d", name, slice);
+
+ hdesc_size = cppi5_hdesc_calc_size(true, PRUETH_NAV_PS_DATA_SIZE,
+ PRUETH_NAV_SW_DATA_SIZE);
+ memset(&rx_cfg, 0, sizeof(rx_cfg));
+ rx_cfg.swdata_size = PRUETH_NAV_SW_DATA_SIZE;
+ rx_cfg.flow_id_num = max_rflows;
+ rx_cfg.flow_id_base = -1; /* udmax will auto select flow id base */
+
+ /* init all flows */
+ rx_chn->dev = dev;
+ rx_chn->descs_num = max_desc_num;
+
+ rx_chn->rx_chn = k3_udma_glue_request_rx_chn(dev, rx_chn->name,
+ &rx_cfg);
+ if (IS_ERR(rx_chn->rx_chn)) {
+ ret = PTR_ERR(rx_chn->rx_chn);
+ rx_chn->rx_chn = NULL;
+ netdev_err(ndev, "Failed to request rx dma ch: %d\n", ret);
+ goto fail;
+ }
+
+ rx_chn->dma_dev = k3_udma_glue_rx_get_dma_device(rx_chn->rx_chn);
+ rx_chn->desc_pool = k3_cppi_desc_pool_create_name(rx_chn->dma_dev,
+ rx_chn->descs_num,
+ hdesc_size,
+ rx_chn->name);
+ if (IS_ERR(rx_chn->desc_pool)) {
+ ret = PTR_ERR(rx_chn->desc_pool);
+ rx_chn->desc_pool = NULL;
+ netdev_err(ndev, "Failed to create rx pool: %d\n", ret);
+ goto fail;
+ }
+
+ emac->rx_flow_id_base = k3_udma_glue_rx_get_flow_id_base(rx_chn->rx_chn);
+ netdev_dbg(ndev, "flow id base = %d\n", emac->rx_flow_id_base);
+
+ fdqring_id = K3_RINGACC_RING_ID_ANY;
+ for (i = 0; i < rx_cfg.flow_id_num; i++) {
+ struct k3_ring_cfg rxring_cfg = {
+ .elm_size = K3_RINGACC_RING_ELSIZE_8,
+ .mode = K3_RINGACC_RING_MODE_RING,
+ .flags = 0,
+ };
+ struct k3_ring_cfg fdqring_cfg = {
+ .elm_size = K3_RINGACC_RING_ELSIZE_8,
+ .flags = K3_RINGACC_RING_SHARED,
+ };
+ struct k3_udma_glue_rx_flow_cfg rx_flow_cfg = {
+ .rx_cfg = rxring_cfg,
+ .rxfdq_cfg = fdqring_cfg,
+ .ring_rxq_id = K3_RINGACC_RING_ID_ANY,
+ .src_tag_lo_sel =
+ K3_UDMA_GLUE_SRC_TAG_LO_USE_REMOTE_SRC_TAG,
+ };
+
+ rx_flow_cfg.ring_rxfdq0_id = fdqring_id;
+ rx_flow_cfg.rx_cfg.size = max_desc_num;
+ rx_flow_cfg.rxfdq_cfg.size = max_desc_num;
+ rx_flow_cfg.rxfdq_cfg.mode = emac->prueth->pdata.fdqring_mode;
+
+ ret = k3_udma_glue_rx_flow_init(rx_chn->rx_chn,
+ i, &rx_flow_cfg);
+ if (ret) {
+ netdev_err(ndev, "Failed to init rx flow%d %d\n",
+ i, ret);
+ goto fail;
+ }
+ if (!i)
+ fdqring_id = k3_udma_glue_rx_flow_get_fdq_id(rx_chn->rx_chn,
+ i);
+ rx_chn->irq[i] = k3_udma_glue_rx_get_irq(rx_chn->rx_chn, i);
+ if (rx_chn->irq[i] <= 0) {
+ ret = rx_chn->irq[i] ?: -ENXIO;
+ netdev_err(ndev, "Failed to get rx dma irq\n");
+ goto fail;
+ }
+ }
+
+ return 0;
+
+fail:
+ prueth_cleanup_rx_chns(emac, rx_chn, max_rflows);
+ return ret;
+}
+
+int prueth_dma_rx_push(struct prueth_emac *emac,
+ struct sk_buff *skb,
+ struct prueth_rx_chn *rx_chn)
+{
+ struct net_device *ndev = emac->ndev;
+ struct cppi5_host_desc_t *desc_rx;
+ u32 pkt_len = skb_tailroom(skb);
+ dma_addr_t desc_dma;
+ dma_addr_t buf_dma;
+ void **swdata;
+
+ desc_rx = k3_cppi_desc_pool_alloc(rx_chn->desc_pool);
+ if (!desc_rx) {
+ netdev_err(ndev, "rx push: failed to allocate descriptor\n");
+ return -ENOMEM;
+ }
+ desc_dma = k3_cppi_desc_pool_virt2dma(rx_chn->desc_pool, desc_rx);
+
+ buf_dma = dma_map_single(rx_chn->dma_dev, skb->data, pkt_len, DMA_FROM_DEVICE);
+ if (unlikely(dma_mapping_error(rx_chn->dma_dev, buf_dma))) {
+ k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx);
+ netdev_err(ndev, "rx push: failed to map rx pkt buffer\n");
+ return -EINVAL;
+ }
+
+ cppi5_hdesc_init(desc_rx, CPPI5_INFO0_HDESC_EPIB_PRESENT,
+ PRUETH_NAV_PS_DATA_SIZE);
+ k3_udma_glue_rx_dma_to_cppi5_addr(rx_chn->rx_chn, &buf_dma);
+ cppi5_hdesc_attach_buf(desc_rx, buf_dma, skb_tailroom(skb), buf_dma, skb_tailroom(skb));
+
+ swdata = cppi5_hdesc_get_swdata(desc_rx);
+ *swdata = skb;
+
+ return k3_udma_glue_push_rx_chn(rx_chn->rx_chn, 0,
+ desc_rx, desc_dma);
+}
+
+u64 icssg_ts_to_ns(u32 hi_sw, u32 hi, u32 lo, u32 cycle_time_ns)
+{
+ u32 iepcount_lo, iepcount_hi, hi_rollover_count;
+ u64 ns;
+
+ iepcount_lo = lo & GENMASK(19, 0);
+ iepcount_hi = (hi & GENMASK(11, 0)) << 12 | lo >> 20;
+ hi_rollover_count = hi >> 11;
+
+ ns = ((u64)hi_rollover_count) << 23 | (iepcount_hi + hi_sw);
+ ns = ns * cycle_time_ns + iepcount_lo;
+
+ return ns;
+}
+
+void emac_rx_timestamp(struct prueth_emac *emac,
+ struct sk_buff *skb, u32 *psdata)
+{
+ struct skb_shared_hwtstamps *ssh;
+ u64 ns;
+
+ u32 hi_sw = readl(emac->prueth->shram.va +
+ TIMESYNC_FW_WC_COUNT_HI_SW_OFFSET_OFFSET);
+ ns = icssg_ts_to_ns(hi_sw, psdata[1], psdata[0],
+ IEP_DEFAULT_CYCLE_TIME_NS);
+
+ ssh = skb_hwtstamps(skb);
+ memset(ssh, 0, sizeof(*ssh));
+ ssh->hwtstamp = ns_to_ktime(ns);
+}
+
+static int emac_rx_packet(struct prueth_emac *emac, u32 flow_id)
+{
+ struct prueth_rx_chn *rx_chn = &emac->rx_chns;
+ u32 buf_dma_len, pkt_len, port_id = 0;
+ struct net_device *ndev = emac->ndev;
+ struct cppi5_host_desc_t *desc_rx;
+ struct sk_buff *skb, *new_skb;
+ dma_addr_t desc_dma, buf_dma;
+ void **swdata;
+ u32 *psdata;
+ int ret;
+
+ ret = k3_udma_glue_pop_rx_chn(rx_chn->rx_chn, flow_id, &desc_dma);
+ if (ret) {
+ if (ret != -ENODATA)
+ netdev_err(ndev, "rx pop: failed: %d\n", ret);
+ return ret;
+ }
+
+ if (cppi5_desc_is_tdcm(desc_dma)) /* Teardown ? */
+ return 0;
+
+ desc_rx = k3_cppi_desc_pool_dma2virt(rx_chn->desc_pool, desc_dma);
+
+ swdata = cppi5_hdesc_get_swdata(desc_rx);
+ skb = *swdata;
+
+ psdata = cppi5_hdesc_get_psdata(desc_rx);
+ /* RX HW timestamp */
+ if (emac->rx_ts_enabled)
+ emac_rx_timestamp(emac, skb, psdata);
+
+ cppi5_hdesc_get_obuf(desc_rx, &buf_dma, &buf_dma_len);
+ k3_udma_glue_rx_cppi5_to_dma_addr(rx_chn->rx_chn, &buf_dma);
+ pkt_len = cppi5_hdesc_get_pktlen(desc_rx);
+ /* firmware adds 4 CRC bytes, strip them */
+ pkt_len -= 4;
+ cppi5_desc_get_tags_ids(&desc_rx->hdr, &port_id, NULL);
+
+ dma_unmap_single(rx_chn->dma_dev, buf_dma, buf_dma_len, DMA_FROM_DEVICE);
+ k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx);
+
+ skb->dev = ndev;
+ new_skb = netdev_alloc_skb_ip_align(ndev, PRUETH_MAX_PKT_SIZE);
+ /* if allocation fails we drop the packet but push the
+ * descriptor back to the ring with old skb to prevent a stall
+ */
+ if (!new_skb) {
+ ndev->stats.rx_dropped++;
+ new_skb = skb;
+ } else {
+ /* send the filled skb up the network stack */
+ skb_put(skb, pkt_len);
+ skb->protocol = eth_type_trans(skb, ndev);
+ napi_gro_receive(&emac->napi_rx, skb);
+ ndev->stats.rx_bytes += pkt_len;
+ ndev->stats.rx_packets++;
+ }
+
+ /* queue another RX DMA */
+ ret = prueth_dma_rx_push(emac, new_skb, &emac->rx_chns);
+ if (WARN_ON(ret < 0)) {
+ dev_kfree_skb_any(new_skb);
+ ndev->stats.rx_errors++;
+ ndev->stats.rx_dropped++;
+ }
+
+ return ret;
+}
+
+static void prueth_rx_cleanup(void *data, dma_addr_t desc_dma)
+{
+ struct prueth_rx_chn *rx_chn = data;
+ struct cppi5_host_desc_t *desc_rx;
+ struct sk_buff *skb;
+ dma_addr_t buf_dma;
+ u32 buf_dma_len;
+ void **swdata;
+
+ desc_rx = k3_cppi_desc_pool_dma2virt(rx_chn->desc_pool, desc_dma);
+ swdata = cppi5_hdesc_get_swdata(desc_rx);
+ skb = *swdata;
+ cppi5_hdesc_get_obuf(desc_rx, &buf_dma, &buf_dma_len);
+ k3_udma_glue_rx_cppi5_to_dma_addr(rx_chn->rx_chn, &buf_dma);
+
+ dma_unmap_single(rx_chn->dma_dev, buf_dma, buf_dma_len,
+ DMA_FROM_DEVICE);
+ k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx);
+
+ dev_kfree_skb_any(skb);
+}
+
+static int prueth_tx_ts_cookie_get(struct prueth_emac *emac)
+{
+ int i;
+
+ /* search and get the next free slot */
+ for (i = 0; i < PRUETH_MAX_TX_TS_REQUESTS; i++) {
+ if (!emac->tx_ts_skb[i]) {
+ emac->tx_ts_skb[i] = ERR_PTR(-EBUSY); /* reserve slot */
+ return i;
+ }
+ }
+
+ return -EBUSY;
+}
+
+/**
+ * emac_ndo_start_xmit - EMAC Transmit function
+ * @skb: SKB pointer
+ * @ndev: EMAC network adapter
+ *
+ * Called by the system to transmit a packet. The packet is queued in the
+ * EMAC hardware transmit queue.
+ * Doesn't wait for completion; TX completion is checked in
+ * emac_tx_complete_packets().
+ *
+ * Return: enum netdev_tx
+ */
+enum netdev_tx emac_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+{
+ struct cppi5_host_desc_t *first_desc, *next_desc, *cur_desc;
+ struct prueth_emac *emac = netdev_priv(ndev);
+ struct netdev_queue *netif_txq;
+ struct prueth_tx_chn *tx_chn;
+ dma_addr_t desc_dma, buf_dma;
+ int i, ret = 0, q_idx;
+ bool in_tx_ts = 0;
+ int tx_ts_cookie;
+ void **swdata;
+ u32 pkt_len;
+ u32 *epib;
+
+ pkt_len = skb_headlen(skb);
+ q_idx = skb_get_queue_mapping(skb);
+
+ tx_chn = &emac->tx_chns[q_idx];
+ netif_txq = netdev_get_tx_queue(ndev, q_idx);
+
+ /* Map the linear buffer */
+ buf_dma = dma_map_single(tx_chn->dma_dev, skb->data, pkt_len, DMA_TO_DEVICE);
+ if (dma_mapping_error(tx_chn->dma_dev, buf_dma)) {
+ netdev_err(ndev, "tx: failed to map skb buffer\n");
+ ret = NETDEV_TX_OK;
+ goto drop_free_skb;
+ }
+
+ first_desc = k3_cppi_desc_pool_alloc(tx_chn->desc_pool);
+ if (!first_desc) {
+ netdev_dbg(ndev, "tx: failed to allocate descriptor\n");
+ dma_unmap_single(tx_chn->dma_dev, buf_dma, pkt_len, DMA_TO_DEVICE);
+ goto drop_stop_q_busy;
+ }
+
+ cppi5_hdesc_init(first_desc, CPPI5_INFO0_HDESC_EPIB_PRESENT,
+ PRUETH_NAV_PS_DATA_SIZE);
+ cppi5_hdesc_set_pkttype(first_desc, 0);
+ epib = first_desc->epib;
+ epib[0] = 0;
+ epib[1] = 0;
+ if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP &&
+ emac->tx_ts_enabled) {
+ tx_ts_cookie = prueth_tx_ts_cookie_get(emac);
+ if (tx_ts_cookie >= 0) {
+ skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
+ /* Request TX timestamp */
+ epib[0] = (u32)tx_ts_cookie;
+ epib[1] = 0x80000000; /* TX TS request */
+ emac->tx_ts_skb[tx_ts_cookie] = skb_get(skb);
+ in_tx_ts = 1;
+ }
+ }
+
+ /* set dst tag to indicate internal qid at the firmware which is at
+ * bit8..bit15. bit0..bit7 indicates port num for directed
+ * packets in case of switch mode operation
+ */
+ cppi5_desc_set_tags_ids(&first_desc->hdr, 0, (emac->port_id | (q_idx << 8)));
+ k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &buf_dma);
+ cppi5_hdesc_attach_buf(first_desc, buf_dma, pkt_len, buf_dma, pkt_len);
+ swdata = cppi5_hdesc_get_swdata(first_desc);
+ *swdata = skb;
+
+ /* Handle the case where skb is fragmented in pages */
+ cur_desc = first_desc;
+ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+ skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+ u32 frag_size = skb_frag_size(frag);
+
+ next_desc = k3_cppi_desc_pool_alloc(tx_chn->desc_pool);
+ if (!next_desc) {
+ netdev_err(ndev,
+ "tx: failed to allocate frag. descriptor\n");
+ goto free_desc_stop_q_busy_cleanup_tx_ts;
+ }
+
+ buf_dma = skb_frag_dma_map(tx_chn->dma_dev, frag, 0, frag_size,
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(tx_chn->dma_dev, buf_dma)) {
+ netdev_err(ndev, "tx: Failed to map skb page\n");
+ k3_cppi_desc_pool_free(tx_chn->desc_pool, next_desc);
+ ret = NETDEV_TX_OK;
+ goto cleanup_tx_ts;
+ }
+
+ cppi5_hdesc_reset_hbdesc(next_desc);
+ k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &buf_dma);
+ cppi5_hdesc_attach_buf(next_desc,
+ buf_dma, frag_size, buf_dma, frag_size);
+
+ desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool,
+ next_desc);
+ k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &desc_dma);
+ cppi5_hdesc_link_hbdesc(cur_desc, desc_dma);
+
+ pkt_len += frag_size;
+ cur_desc = next_desc;
+ }
+ WARN_ON_ONCE(pkt_len != skb->len);
+
+ /* report bql before sending packet */
+ netdev_tx_sent_queue(netif_txq, pkt_len);
+
+ cppi5_hdesc_set_pktlen(first_desc, pkt_len);
+ desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool, first_desc);
+ /* cppi5_desc_dump(first_desc, 64); */
+
+ skb_tx_timestamp(skb); /* SW timestamp if SKBTX_IN_PROGRESS not set */
+ ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
+ if (ret) {
+ netdev_err(ndev, "tx: push failed: %d\n", ret);
+ goto drop_free_descs;
+ }
+
+ if (in_tx_ts)
+ atomic_inc(&emac->tx_ts_pending);
+
+ if (k3_cppi_desc_pool_avail(tx_chn->desc_pool) < MAX_SKB_FRAGS) {
+ netif_tx_stop_queue(netif_txq);
+ /* Barrier, so that stop_queue is visible to other CPUs */
+ smp_mb__after_atomic();
+
+ if (k3_cppi_desc_pool_avail(tx_chn->desc_pool) >=
+ MAX_SKB_FRAGS)
+ netif_tx_wake_queue(netif_txq);
+ }
+
+ return NETDEV_TX_OK;
+
+cleanup_tx_ts:
+ if (in_tx_ts) {
+ dev_kfree_skb_any(emac->tx_ts_skb[tx_ts_cookie]);
+ emac->tx_ts_skb[tx_ts_cookie] = NULL;
+ }
+
+drop_free_descs:
+ prueth_xmit_free(tx_chn, first_desc);
+
+drop_free_skb:
+ dev_kfree_skb_any(skb);
+
+ /* error */
+ ndev->stats.tx_dropped++;
+ netdev_err(ndev, "tx: error: %d\n", ret);
+
+ return ret;
+
+free_desc_stop_q_busy_cleanup_tx_ts:
+ if (in_tx_ts) {
+ dev_kfree_skb_any(emac->tx_ts_skb[tx_ts_cookie]);
+ emac->tx_ts_skb[tx_ts_cookie] = NULL;
+ }
+ prueth_xmit_free(tx_chn, first_desc);
+
+drop_stop_q_busy:
+ netif_tx_stop_queue(netif_txq);
+ return NETDEV_TX_BUSY;
+}
+
+static void prueth_tx_cleanup(void *data, dma_addr_t desc_dma)
+{
+ struct prueth_tx_chn *tx_chn = data;
+ struct cppi5_host_desc_t *desc_tx;
+ struct sk_buff *skb;
+ void **swdata;
+
+ desc_tx = k3_cppi_desc_pool_dma2virt(tx_chn->desc_pool, desc_dma);
+ swdata = cppi5_hdesc_get_swdata(desc_tx);
+ skb = *(swdata);
+ prueth_xmit_free(tx_chn, desc_tx);
+
+ dev_kfree_skb_any(skb);
+}
+
+irqreturn_t prueth_rx_irq(int irq, void *dev_id)
+{
+ struct prueth_emac *emac = dev_id;
+
+ disable_irq_nosync(irq);
+ napi_schedule(&emac->napi_rx);
+
+ return IRQ_HANDLED;
+}
+
+void prueth_emac_stop(struct prueth_emac *emac)
+{
+ struct prueth *prueth = emac->prueth;
+ int slice;
+
+ switch (emac->port_id) {
+ case PRUETH_PORT_MII0:
+ slice = ICSS_SLICE0;
+ break;
+ case PRUETH_PORT_MII1:
+ slice = ICSS_SLICE1;
+ break;
+ default:
+ netdev_err(emac->ndev, "invalid port\n");
+ return;
+ }
+
+ emac->fw_running = 0;
+ rproc_shutdown(prueth->txpru[slice]);
+ rproc_shutdown(prueth->rtu[slice]);
+ rproc_shutdown(prueth->pru[slice]);
+}
+
+void prueth_cleanup_tx_ts(struct prueth_emac *emac)
+{
+ int i;
+
+ for (i = 0; i < PRUETH_MAX_TX_TS_REQUESTS; i++) {
+ if (emac->tx_ts_skb[i]) {
+ dev_kfree_skb_any(emac->tx_ts_skb[i]);
+ emac->tx_ts_skb[i] = NULL;
+ }
+ }
+}
+
+int emac_napi_rx_poll(struct napi_struct *napi_rx, int budget)
+{
+ struct prueth_emac *emac = prueth_napi_to_emac(napi_rx);
+ int rx_flow = PRUETH_RX_FLOW_DATA;
+ int flow = PRUETH_MAX_RX_FLOWS;
+ int num_rx = 0;
+ int cur_budget;
+ int ret;
+
+ while (flow--) {
+ cur_budget = budget - num_rx;
+
+ while (cur_budget--) {
+ ret = emac_rx_packet(emac, flow);
+ if (ret)
+ break;
+ num_rx++;
+ }
+
+ if (num_rx >= budget)
+ break;
+ }
+
+ if (num_rx < budget && napi_complete_done(napi_rx, num_rx))
+ enable_irq(emac->rx_chns.irq[rx_flow]);
+
+ return num_rx;
+}
+
+int prueth_prepare_rx_chan(struct prueth_emac *emac,
+ struct prueth_rx_chn *chn,
+ int buf_size)
+{
+ struct sk_buff *skb;
+ int i, ret;
+
+ for (i = 0; i < chn->descs_num; i++) {
+ skb = __netdev_alloc_skb_ip_align(NULL, buf_size, GFP_KERNEL);
+ if (!skb)
+ return -ENOMEM;
+
+ ret = prueth_dma_rx_push(emac, skb, chn);
+ if (ret < 0) {
+ netdev_err(emac->ndev,
+ "cannot submit skb for rx chan %s ret %d\n",
+ chn->name, ret);
+ kfree_skb(skb);
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
+void prueth_reset_tx_chan(struct prueth_emac *emac, int ch_num,
+ bool free_skb)
+{
+ int i;
+
+ for (i = 0; i < ch_num; i++) {
+ if (free_skb)
+ k3_udma_glue_reset_tx_chn(emac->tx_chns[i].tx_chn,
+ &emac->tx_chns[i],
+ prueth_tx_cleanup);
+ k3_udma_glue_disable_tx_chn(emac->tx_chns[i].tx_chn);
+ }
+}
+
+void prueth_reset_rx_chan(struct prueth_rx_chn *chn,
+ int num_flows, bool disable)
+{
+ int i;
+
+ for (i = 0; i < num_flows; i++)
+ k3_udma_glue_reset_rx_chn(chn->rx_chn, i, chn,
+ prueth_rx_cleanup, !!i);
+ if (disable)
+ k3_udma_glue_disable_rx_chn(chn->rx_chn);
+}
+
+void emac_ndo_tx_timeout(struct net_device *ndev, unsigned int txqueue)
+{
+ ndev->stats.tx_errors++;
+}
+
+static int emac_set_ts_config(struct net_device *ndev, struct ifreq *ifr)
+{
+ struct prueth_emac *emac = netdev_priv(ndev);
+ struct hwtstamp_config config;
+
+ if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
+ return -EFAULT;
+
+ switch (config.tx_type) {
+ case HWTSTAMP_TX_OFF:
+ emac->tx_ts_enabled = 0;
+ break;
+ case HWTSTAMP_TX_ON:
+ emac->tx_ts_enabled = 1;
+ break;
+ default:
+ return -ERANGE;
+ }
+
+ switch (config.rx_filter) {
+ case HWTSTAMP_FILTER_NONE:
+ emac->rx_ts_enabled = 0;
+ break;
+ case HWTSTAMP_FILTER_ALL:
+ case HWTSTAMP_FILTER_SOME:
+ case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
+ case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
+ case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
+ case HWTSTAMP_FILTER_PTP_V2_EVENT:
+ case HWTSTAMP_FILTER_PTP_V2_SYNC:
+ case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
+ case HWTSTAMP_FILTER_NTP_ALL:
+ emac->rx_ts_enabled = 1;
+ config.rx_filter = HWTSTAMP_FILTER_ALL;
+ break;
+ default:
+ return -ERANGE;
+ }
+
+ return copy_to_user(ifr->ifr_data, &config, sizeof(config)) ?
+ -EFAULT : 0;
+}
+
+static int emac_get_ts_config(struct net_device *ndev, struct ifreq *ifr)
+{
+ struct prueth_emac *emac = netdev_priv(ndev);
+ struct hwtstamp_config config;
+
+ config.flags = 0;
+ config.tx_type = emac->tx_ts_enabled ? HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
+ config.rx_filter = emac->rx_ts_enabled ? HWTSTAMP_FILTER_ALL : HWTSTAMP_FILTER_NONE;
+
+ return copy_to_user(ifr->ifr_data, &config, sizeof(config)) ?
+ -EFAULT : 0;
+}
+
+int emac_ndo_ioctl(struct net_device *ndev, struct ifreq *ifr, int cmd)
+{
+ switch (cmd) {
+ case SIOCGHWTSTAMP:
+ return emac_get_ts_config(ndev, ifr);
+ case SIOCSHWTSTAMP:
+ return emac_set_ts_config(ndev, ifr);
+ default:
+ break;
+ }
+
+ return phy_do_ioctl(ndev, ifr, cmd);
+}
+
+void emac_ndo_get_stats64(struct net_device *ndev,
+ struct rtnl_link_stats64 *stats)
+{
+ struct prueth_emac *emac = netdev_priv(ndev);
+
+ emac_update_hardware_stats(emac);
+
+ stats->rx_packets = emac_get_stat_by_name(emac, "rx_packets");
+ stats->rx_bytes = emac_get_stat_by_name(emac, "rx_bytes");
+ stats->tx_packets = emac_get_stat_by_name(emac, "tx_packets");
+ stats->tx_bytes = emac_get_stat_by_name(emac, "tx_bytes");
+ stats->rx_crc_errors = emac_get_stat_by_name(emac, "rx_crc_errors");
+ stats->rx_over_errors = emac_get_stat_by_name(emac, "rx_over_errors");
+ stats->multicast = emac_get_stat_by_name(emac, "rx_multicast_frames");
+
+ stats->rx_errors = ndev->stats.rx_errors;
+ stats->rx_dropped = ndev->stats.rx_dropped;
+ stats->tx_errors = ndev->stats.tx_errors;
+ stats->tx_dropped = ndev->stats.tx_dropped;
+}
+
+int emac_ndo_get_phys_port_name(struct net_device *ndev, char *name,
+ size_t len)
+{
+ struct prueth_emac *emac = netdev_priv(ndev);
+ int ret;
+
+ ret = snprintf(name, len, "p%d", emac->port_id);
+ if (ret >= len)
+ return -EINVAL;
+
+ return 0;
+}
+
+/* get emac_port corresponding to eth_node name */
+int prueth_node_port(struct device_node *eth_node)
+{
+ u32 port_id;
+ int ret;
+
+ ret = of_property_read_u32(eth_node, "reg", &port_id);
+ if (ret)
+ return ret;
+
+ if (port_id == 0)
+ return PRUETH_PORT_MII0;
+ else if (port_id == 1)
+ return PRUETH_PORT_MII1;
+ else
+ return PRUETH_PORT_INVALID;
+}
+
+/* get MAC instance corresponding to eth_node name */
+int prueth_node_mac(struct device_node *eth_node)
+{
+ u32 port_id;
+ int ret;
+
+ ret = of_property_read_u32(eth_node, "reg", &port_id);
+ if (ret)
+ return ret;
+
+ if (port_id == 0)
+ return PRUETH_MAC0;
+ else if (port_id == 1)
+ return PRUETH_MAC1;
+ else
+ return PRUETH_MAC_INVALID;
+}
+
+void prueth_netdev_exit(struct prueth *prueth,
+ struct device_node *eth_node)
+{
+ struct prueth_emac *emac;
+ enum prueth_mac mac;
+
+ mac = prueth_node_mac(eth_node);
+ if (mac == PRUETH_MAC_INVALID)
+ return;
+
+ emac = prueth->emac[mac];
+ if (!emac)
+ return;
+
+ if (of_phy_is_fixed_link(emac->phy_node))
+ of_phy_deregister_fixed_link(emac->phy_node);
+
+ netif_napi_del(&emac->napi_rx);
+
+ pruss_release_mem_region(prueth->pruss, &emac->dram);
+ destroy_workqueue(emac->cmd_wq);
+ free_netdev(emac->ndev);
+ prueth->emac[mac] = NULL;
+}
+
+int prueth_get_cores(struct prueth *prueth, int slice)
+{
+ struct device *dev = prueth->dev;
+ enum pruss_pru_id pruss_id;
+ struct device_node *np;
+ int idx = -1, ret;
+
+ np = dev->of_node;
+
+ switch (slice) {
+ case ICSS_SLICE0:
+ idx = 0;
+ break;
+ case ICSS_SLICE1:
+ idx = 3;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ prueth->pru[slice] = pru_rproc_get(np, idx, &pruss_id);
+ if (IS_ERR(prueth->pru[slice])) {
+ ret = PTR_ERR(prueth->pru[slice]);
+ prueth->pru[slice] = NULL;
+ return dev_err_probe(dev, ret, "unable to get PRU%d\n", slice);
+ }
+ prueth->pru_id[slice] = pruss_id;
+
+ idx++;
+ prueth->rtu[slice] = pru_rproc_get(np, idx, NULL);
+ if (IS_ERR(prueth->rtu[slice])) {
+ ret = PTR_ERR(prueth->rtu[slice]);
+ prueth->rtu[slice] = NULL;
+ return dev_err_probe(dev, ret, "unable to get RTU%d\n", slice);
+ }
+
+ idx++;
+ prueth->txpru[slice] = pru_rproc_get(np, idx, NULL);
+ if (IS_ERR(prueth->txpru[slice])) {
+ ret = PTR_ERR(prueth->txpru[slice]);
+ prueth->txpru[slice] = NULL;
+ return dev_err_probe(dev, ret, "unable to get TX_PRU%d\n", slice);
+ }
+
+ return 0;
+}
+
+void prueth_put_cores(struct prueth *prueth, int slice)
+{
+ if (prueth->txpru[slice])
+ pru_rproc_put(prueth->txpru[slice]);
+
+ if (prueth->rtu[slice])
+ pru_rproc_put(prueth->rtu[slice]);
+
+ if (prueth->pru[slice])
+ pru_rproc_put(prueth->pru[slice]);
+}
+
+#ifdef CONFIG_PM_SLEEP
+static int prueth_suspend(struct device *dev)
+{
+ struct prueth *prueth = dev_get_drvdata(dev);
+ struct net_device *ndev;
+ int i, ret;
+
+ for (i = 0; i < PRUETH_NUM_MACS; i++) {
+ ndev = prueth->registered_netdevs[i];
+
+ if (!ndev)
+ continue;
+
+ if (netif_running(ndev)) {
+ netif_device_detach(ndev);
+ ret = ndev->netdev_ops->ndo_stop(ndev);
+ if (ret < 0) {
+ netdev_err(ndev, "failed to stop: %d\n", ret);
+ return ret;
+ }
+ }
+ }
+
+ return 0;
+}
+
+static int prueth_resume(struct device *dev)
+{
+ struct prueth *prueth = dev_get_drvdata(dev);
+ struct net_device *ndev;
+ int i, ret;
+
+ for (i = 0; i < PRUETH_NUM_MACS; i++) {
+ ndev = prueth->registered_netdevs[i];
+
+ if (!ndev)
+ continue;
+
+ if (netif_running(ndev)) {
+ ret = ndev->netdev_ops->ndo_open(ndev);
+ if (ret < 0) {
+ netdev_err(ndev, "failed to start: %d\n", ret);
+ return ret;
+ }
+ netif_device_attach(ndev);
+ }
+ }
+
+ return 0;
+}
+#endif /* CONFIG_PM_SLEEP */
+
+const struct dev_pm_ops prueth_dev_pm_ops = {
+ SET_SYSTEM_SLEEP_PM_OPS(prueth_suspend, prueth_resume)
+};
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
index cf7b73f8f450..e6eac01f9f99 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
@@ -34,568 +34,9 @@
#define PRUETH_MODULE_DESCRIPTION "PRUSS ICSSG Ethernet driver"
-/* Netif debug messages possible */
-#define PRUETH_EMAC_DEBUG (NETIF_MSG_DRV | \
- NETIF_MSG_PROBE | \
- NETIF_MSG_LINK | \
- NETIF_MSG_TIMER | \
- NETIF_MSG_IFDOWN | \
- NETIF_MSG_IFUP | \
- NETIF_MSG_RX_ERR | \
- NETIF_MSG_TX_ERR | \
- NETIF_MSG_TX_QUEUED | \
- NETIF_MSG_INTR | \
- NETIF_MSG_TX_DONE | \
- NETIF_MSG_RX_STATUS | \
- NETIF_MSG_PKTDATA | \
- NETIF_MSG_HW | \
- NETIF_MSG_WOL)
-
-#define prueth_napi_to_emac(napi) container_of(napi, struct prueth_emac, napi_rx)
-
/* CTRLMMR_ICSSG_RGMII_CTRL register bits */
#define ICSSG_CTRL_RGMII_ID_MODE BIT(24)
-#define IEP_DEFAULT_CYCLE_TIME_NS 1000000 /* 1 ms */
-
-static void prueth_cleanup_rx_chns(struct prueth_emac *emac,
- struct prueth_rx_chn *rx_chn,
- int max_rflows)
-{
- if (rx_chn->desc_pool)
- k3_cppi_desc_pool_destroy(rx_chn->desc_pool);
-
- if (rx_chn->rx_chn)
- k3_udma_glue_release_rx_chn(rx_chn->rx_chn);
-}
-
-static void prueth_cleanup_tx_chns(struct prueth_emac *emac)
-{
- int i;
-
- for (i = 0; i < emac->tx_ch_num; i++) {
- struct prueth_tx_chn *tx_chn = &emac->tx_chns[i];
-
- if (tx_chn->desc_pool)
- k3_cppi_desc_pool_destroy(tx_chn->desc_pool);
-
- if (tx_chn->tx_chn)
- k3_udma_glue_release_tx_chn(tx_chn->tx_chn);
-
- /* Assume prueth_cleanup_tx_chns() is called at the
- * end after all channel resources are freed
- */
- memset(tx_chn, 0, sizeof(*tx_chn));
- }
-}
-
-static void prueth_ndev_del_tx_napi(struct prueth_emac *emac, int num)
-{
- int i;
-
- for (i = 0; i < num; i++) {
- struct prueth_tx_chn *tx_chn = &emac->tx_chns[i];
-
- if (tx_chn->irq)
- free_irq(tx_chn->irq, tx_chn);
- netif_napi_del(&tx_chn->napi_tx);
- }
-}
-
-static void prueth_xmit_free(struct prueth_tx_chn *tx_chn,
- struct cppi5_host_desc_t *desc)
-{
- struct cppi5_host_desc_t *first_desc, *next_desc;
- dma_addr_t buf_dma, next_desc_dma;
- u32 buf_dma_len;
-
- first_desc = desc;
- next_desc = first_desc;
-
- cppi5_hdesc_get_obuf(first_desc, &buf_dma, &buf_dma_len);
- k3_udma_glue_tx_cppi5_to_dma_addr(tx_chn->tx_chn, &buf_dma);
-
- dma_unmap_single(tx_chn->dma_dev, buf_dma, buf_dma_len,
- DMA_TO_DEVICE);
-
- next_desc_dma = cppi5_hdesc_get_next_hbdesc(first_desc);
- k3_udma_glue_tx_cppi5_to_dma_addr(tx_chn->tx_chn, &next_desc_dma);
- while (next_desc_dma) {
- next_desc = k3_cppi_desc_pool_dma2virt(tx_chn->desc_pool,
- next_desc_dma);
- cppi5_hdesc_get_obuf(next_desc, &buf_dma, &buf_dma_len);
- k3_udma_glue_tx_cppi5_to_dma_addr(tx_chn->tx_chn, &buf_dma);
-
- dma_unmap_page(tx_chn->dma_dev, buf_dma, buf_dma_len,
- DMA_TO_DEVICE);
-
- next_desc_dma = cppi5_hdesc_get_next_hbdesc(next_desc);
- k3_udma_glue_tx_cppi5_to_dma_addr(tx_chn->tx_chn, &next_desc_dma);
-
- k3_cppi_desc_pool_free(tx_chn->desc_pool, next_desc);
- }
-
- k3_cppi_desc_pool_free(tx_chn->desc_pool, first_desc);
-}
-
-static int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
- int budget)
-{
- struct net_device *ndev = emac->ndev;
- struct cppi5_host_desc_t *desc_tx;
- struct netdev_queue *netif_txq;
- struct prueth_tx_chn *tx_chn;
- unsigned int total_bytes = 0;
- struct sk_buff *skb;
- dma_addr_t desc_dma;
- int res, num_tx = 0;
- void **swdata;
-
- tx_chn = &emac->tx_chns[chn];
-
- while (true) {
- res = k3_udma_glue_pop_tx_chn(tx_chn->tx_chn, &desc_dma);
- if (res == -ENODATA)
- break;
-
- /* teardown completion */
- if (cppi5_desc_is_tdcm(desc_dma)) {
- if (atomic_dec_and_test(&emac->tdown_cnt))
- complete(&emac->tdown_complete);
- break;
- }
-
- desc_tx = k3_cppi_desc_pool_dma2virt(tx_chn->desc_pool,
- desc_dma);
- swdata = cppi5_hdesc_get_swdata(desc_tx);
-
- skb = *(swdata);
- prueth_xmit_free(tx_chn, desc_tx);
-
- ndev = skb->dev;
- ndev->stats.tx_packets++;
- ndev->stats.tx_bytes += skb->len;
- total_bytes += skb->len;
- napi_consume_skb(skb, budget);
- num_tx++;
- }
-
- if (!num_tx)
- return 0;
-
- netif_txq = netdev_get_tx_queue(ndev, chn);
- netdev_tx_completed_queue(netif_txq, num_tx, total_bytes);
-
- if (netif_tx_queue_stopped(netif_txq)) {
- /* If the TX queue was stopped, wake it now
- * if we have enough room.
- */
- __netif_tx_lock(netif_txq, smp_processor_id());
- if (netif_running(ndev) &&
- (k3_cppi_desc_pool_avail(tx_chn->desc_pool) >=
- MAX_SKB_FRAGS))
- netif_tx_wake_queue(netif_txq);
- __netif_tx_unlock(netif_txq);
- }
-
- return num_tx;
-}
-
-static int emac_napi_tx_poll(struct napi_struct *napi_tx, int budget)
-{
- struct prueth_tx_chn *tx_chn = prueth_napi_to_tx_chn(napi_tx);
- struct prueth_emac *emac = tx_chn->emac;
- int num_tx_packets;
-
- num_tx_packets = emac_tx_complete_packets(emac, tx_chn->id, budget);
-
- if (num_tx_packets >= budget)
- return budget;
-
- if (napi_complete_done(napi_tx, num_tx_packets))
- enable_irq(tx_chn->irq);
-
- return num_tx_packets;
-}
-
-static irqreturn_t prueth_tx_irq(int irq, void *dev_id)
-{
- struct prueth_tx_chn *tx_chn = dev_id;
-
- disable_irq_nosync(irq);
- napi_schedule(&tx_chn->napi_tx);
-
- return IRQ_HANDLED;
-}
-
-static int prueth_ndev_add_tx_napi(struct prueth_emac *emac)
-{
- struct prueth *prueth = emac->prueth;
- int i, ret;
-
- for (i = 0; i < emac->tx_ch_num; i++) {
- struct prueth_tx_chn *tx_chn = &emac->tx_chns[i];
-
- netif_napi_add_tx(emac->ndev, &tx_chn->napi_tx, emac_napi_tx_poll);
- ret = request_irq(tx_chn->irq, prueth_tx_irq,
- IRQF_TRIGGER_HIGH, tx_chn->name,
- tx_chn);
- if (ret) {
- netif_napi_del(&tx_chn->napi_tx);
- dev_err(prueth->dev, "unable to request TX IRQ %d\n",
- tx_chn->irq);
- goto fail;
- }
- }
-
- return 0;
-fail:
- prueth_ndev_del_tx_napi(emac, i);
- return ret;
-}
-
-static int prueth_init_tx_chns(struct prueth_emac *emac)
-{
- static const struct k3_ring_cfg ring_cfg = {
- .elm_size = K3_RINGACC_RING_ELSIZE_8,
- .mode = K3_RINGACC_RING_MODE_RING,
- .flags = 0,
- .size = PRUETH_MAX_TX_DESC,
- };
- struct k3_udma_glue_tx_channel_cfg tx_cfg;
- struct device *dev = emac->prueth->dev;
- struct net_device *ndev = emac->ndev;
- int ret, slice, i;
- u32 hdesc_size;
-
- slice = prueth_emac_slice(emac);
- if (slice < 0)
- return slice;
-
- init_completion(&emac->tdown_complete);
-
- hdesc_size = cppi5_hdesc_calc_size(true, PRUETH_NAV_PS_DATA_SIZE,
- PRUETH_NAV_SW_DATA_SIZE);
- memset(&tx_cfg, 0, sizeof(tx_cfg));
- tx_cfg.swdata_size = PRUETH_NAV_SW_DATA_SIZE;
- tx_cfg.tx_cfg = ring_cfg;
- tx_cfg.txcq_cfg = ring_cfg;
-
- for (i = 0; i < emac->tx_ch_num; i++) {
- struct prueth_tx_chn *tx_chn = &emac->tx_chns[i];
-
- /* To differentiate channels for SLICE0 vs SLICE1 */
- snprintf(tx_chn->name, sizeof(tx_chn->name),
- "tx%d-%d", slice, i);
-
- tx_chn->emac = emac;
- tx_chn->id = i;
- tx_chn->descs_num = PRUETH_MAX_TX_DESC;
-
- tx_chn->tx_chn =
- k3_udma_glue_request_tx_chn(dev, tx_chn->name,
- &tx_cfg);
- if (IS_ERR(tx_chn->tx_chn)) {
- ret = PTR_ERR(tx_chn->tx_chn);
- tx_chn->tx_chn = NULL;
- netdev_err(ndev,
- "Failed to request tx dma ch: %d\n", ret);
- goto fail;
- }
-
- tx_chn->dma_dev = k3_udma_glue_tx_get_dma_device(tx_chn->tx_chn);
- tx_chn->desc_pool =
- k3_cppi_desc_pool_create_name(tx_chn->dma_dev,
- tx_chn->descs_num,
- hdesc_size,
- tx_chn->name);
- if (IS_ERR(tx_chn->desc_pool)) {
- ret = PTR_ERR(tx_chn->desc_pool);
- tx_chn->desc_pool = NULL;
- netdev_err(ndev, "Failed to create tx pool: %d\n", ret);
- goto fail;
- }
-
- ret = k3_udma_glue_tx_get_irq(tx_chn->tx_chn);
- if (ret < 0) {
- netdev_err(ndev, "failed to get tx irq\n");
- goto fail;
- }
- tx_chn->irq = ret;
-
- snprintf(tx_chn->name, sizeof(tx_chn->name), "%s-tx%d",
- dev_name(dev), tx_chn->id);
- }
-
- return 0;
-
-fail:
- prueth_cleanup_tx_chns(emac);
- return ret;
-}
-
-static int prueth_init_rx_chns(struct prueth_emac *emac,
- struct prueth_rx_chn *rx_chn,
- char *name, u32 max_rflows,
- u32 max_desc_num)
-{
- struct k3_udma_glue_rx_channel_cfg rx_cfg;
- struct device *dev = emac->prueth->dev;
- struct net_device *ndev = emac->ndev;
- u32 fdqring_id, hdesc_size;
- int i, ret = 0, slice;
-
- slice = prueth_emac_slice(emac);
- if (slice < 0)
- return slice;
-
- /* To differentiate channels for SLICE0 vs SLICE1 */
- snprintf(rx_chn->name, sizeof(rx_chn->name), "%s%d", name, slice);
-
- hdesc_size = cppi5_hdesc_calc_size(true, PRUETH_NAV_PS_DATA_SIZE,
- PRUETH_NAV_SW_DATA_SIZE);
- memset(&rx_cfg, 0, sizeof(rx_cfg));
- rx_cfg.swdata_size = PRUETH_NAV_SW_DATA_SIZE;
- rx_cfg.flow_id_num = max_rflows;
- rx_cfg.flow_id_base = -1; /* udmax will auto select flow id base */
-
- /* init all flows */
- rx_chn->dev = dev;
- rx_chn->descs_num = max_desc_num;
-
- rx_chn->rx_chn = k3_udma_glue_request_rx_chn(dev, rx_chn->name,
- &rx_cfg);
- if (IS_ERR(rx_chn->rx_chn)) {
- ret = PTR_ERR(rx_chn->rx_chn);
- rx_chn->rx_chn = NULL;
- netdev_err(ndev, "Failed to request rx dma ch: %d\n", ret);
- goto fail;
- }
-
- rx_chn->dma_dev = k3_udma_glue_rx_get_dma_device(rx_chn->rx_chn);
- rx_chn->desc_pool = k3_cppi_desc_pool_create_name(rx_chn->dma_dev,
- rx_chn->descs_num,
- hdesc_size,
- rx_chn->name);
- if (IS_ERR(rx_chn->desc_pool)) {
- ret = PTR_ERR(rx_chn->desc_pool);
- rx_chn->desc_pool = NULL;
- netdev_err(ndev, "Failed to create rx pool: %d\n", ret);
- goto fail;
- }
-
- emac->rx_flow_id_base = k3_udma_glue_rx_get_flow_id_base(rx_chn->rx_chn);
- netdev_dbg(ndev, "flow id base = %d\n", emac->rx_flow_id_base);
-
- fdqring_id = K3_RINGACC_RING_ID_ANY;
- for (i = 0; i < rx_cfg.flow_id_num; i++) {
- struct k3_ring_cfg rxring_cfg = {
- .elm_size = K3_RINGACC_RING_ELSIZE_8,
- .mode = K3_RINGACC_RING_MODE_RING,
- .flags = 0,
- };
- struct k3_ring_cfg fdqring_cfg = {
- .elm_size = K3_RINGACC_RING_ELSIZE_8,
- .flags = K3_RINGACC_RING_SHARED,
- };
- struct k3_udma_glue_rx_flow_cfg rx_flow_cfg = {
- .rx_cfg = rxring_cfg,
- .rxfdq_cfg = fdqring_cfg,
- .ring_rxq_id = K3_RINGACC_RING_ID_ANY,
- .src_tag_lo_sel =
- K3_UDMA_GLUE_SRC_TAG_LO_USE_REMOTE_SRC_TAG,
- };
-
- rx_flow_cfg.ring_rxfdq0_id = fdqring_id;
- rx_flow_cfg.rx_cfg.size = max_desc_num;
- rx_flow_cfg.rxfdq_cfg.size = max_desc_num;
- rx_flow_cfg.rxfdq_cfg.mode = emac->prueth->pdata.fdqring_mode;
-
- ret = k3_udma_glue_rx_flow_init(rx_chn->rx_chn,
- i, &rx_flow_cfg);
- if (ret) {
- netdev_err(ndev, "Failed to init rx flow%d %d\n",
- i, ret);
- goto fail;
- }
- if (!i)
- fdqring_id = k3_udma_glue_rx_flow_get_fdq_id(rx_chn->rx_chn,
- i);
- rx_chn->irq[i] = k3_udma_glue_rx_get_irq(rx_chn->rx_chn, i);
- if (rx_chn->irq[i] <= 0) {
- ret = rx_chn->irq[i];
- netdev_err(ndev, "Failed to get rx dma irq");
- goto fail;
- }
- }
-
- return 0;
-
-fail:
- prueth_cleanup_rx_chns(emac, rx_chn, max_rflows);
- return ret;
-}
-
-static int prueth_dma_rx_push(struct prueth_emac *emac,
- struct sk_buff *skb,
- struct prueth_rx_chn *rx_chn)
-{
- struct net_device *ndev = emac->ndev;
- struct cppi5_host_desc_t *desc_rx;
- u32 pkt_len = skb_tailroom(skb);
- dma_addr_t desc_dma;
- dma_addr_t buf_dma;
- void **swdata;
-
- desc_rx = k3_cppi_desc_pool_alloc(rx_chn->desc_pool);
- if (!desc_rx) {
- netdev_err(ndev, "rx push: failed to allocate descriptor\n");
- return -ENOMEM;
- }
- desc_dma = k3_cppi_desc_pool_virt2dma(rx_chn->desc_pool, desc_rx);
-
- buf_dma = dma_map_single(rx_chn->dma_dev, skb->data, pkt_len, DMA_FROM_DEVICE);
- if (unlikely(dma_mapping_error(rx_chn->dma_dev, buf_dma))) {
- k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx);
- netdev_err(ndev, "rx push: failed to map rx pkt buffer\n");
- return -EINVAL;
- }
-
- cppi5_hdesc_init(desc_rx, CPPI5_INFO0_HDESC_EPIB_PRESENT,
- PRUETH_NAV_PS_DATA_SIZE);
- k3_udma_glue_rx_dma_to_cppi5_addr(rx_chn->rx_chn, &buf_dma);
- cppi5_hdesc_attach_buf(desc_rx, buf_dma, skb_tailroom(skb), buf_dma, skb_tailroom(skb));
-
- swdata = cppi5_hdesc_get_swdata(desc_rx);
- *swdata = skb;
-
- return k3_udma_glue_push_rx_chn(rx_chn->rx_chn, 0,
- desc_rx, desc_dma);
-}
-
-static u64 icssg_ts_to_ns(u32 hi_sw, u32 hi, u32 lo, u32 cycle_time_ns)
-{
- u32 iepcount_lo, iepcount_hi, hi_rollover_count;
- u64 ns;
-
- iepcount_lo = lo & GENMASK(19, 0);
- iepcount_hi = (hi & GENMASK(11, 0)) << 12 | lo >> 20;
- hi_rollover_count = hi >> 11;
-
- ns = ((u64)hi_rollover_count) << 23 | (iepcount_hi + hi_sw);
- ns = ns * cycle_time_ns + iepcount_lo;
-
- return ns;
-}
-
-static void emac_rx_timestamp(struct prueth_emac *emac,
- struct sk_buff *skb, u32 *psdata)
-{
- struct skb_shared_hwtstamps *ssh;
- u64 ns;
-
- u32 hi_sw = readl(emac->prueth->shram.va +
- TIMESYNC_FW_WC_COUNT_HI_SW_OFFSET_OFFSET);
- ns = icssg_ts_to_ns(hi_sw, psdata[1], psdata[0],
- IEP_DEFAULT_CYCLE_TIME_NS);
-
- ssh = skb_hwtstamps(skb);
- memset(ssh, 0, sizeof(*ssh));
- ssh->hwtstamp = ns_to_ktime(ns);
-}
-
-static int emac_rx_packet(struct prueth_emac *emac, u32 flow_id)
-{
- struct prueth_rx_chn *rx_chn = &emac->rx_chns;
- u32 buf_dma_len, pkt_len, port_id = 0;
- struct net_device *ndev = emac->ndev;
- struct cppi5_host_desc_t *desc_rx;
- struct sk_buff *skb, *new_skb;
- dma_addr_t desc_dma, buf_dma;
- void **swdata;
- u32 *psdata;
- int ret;
-
- ret = k3_udma_glue_pop_rx_chn(rx_chn->rx_chn, flow_id, &desc_dma);
- if (ret) {
- if (ret != -ENODATA)
- netdev_err(ndev, "rx pop: failed: %d\n", ret);
- return ret;
- }
-
- if (cppi5_desc_is_tdcm(desc_dma)) /* Teardown ? */
- return 0;
-
- desc_rx = k3_cppi_desc_pool_dma2virt(rx_chn->desc_pool, desc_dma);
-
- swdata = cppi5_hdesc_get_swdata(desc_rx);
- skb = *swdata;
-
- psdata = cppi5_hdesc_get_psdata(desc_rx);
- /* RX HW timestamp */
- if (emac->rx_ts_enabled)
- emac_rx_timestamp(emac, skb, psdata);
-
- cppi5_hdesc_get_obuf(desc_rx, &buf_dma, &buf_dma_len);
- k3_udma_glue_rx_cppi5_to_dma_addr(rx_chn->rx_chn, &buf_dma);
- pkt_len = cppi5_hdesc_get_pktlen(desc_rx);
- /* firmware adds 4 CRC bytes, strip them */
- pkt_len -= 4;
- cppi5_desc_get_tags_ids(&desc_rx->hdr, &port_id, NULL);
-
- dma_unmap_single(rx_chn->dma_dev, buf_dma, buf_dma_len, DMA_FROM_DEVICE);
- k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx);
-
- skb->dev = ndev;
- new_skb = netdev_alloc_skb_ip_align(ndev, PRUETH_MAX_PKT_SIZE);
- /* if allocation fails we drop the packet but push the
- * descriptor back to the ring with old skb to prevent a stall
- */
- if (!new_skb) {
- ndev->stats.rx_dropped++;
- new_skb = skb;
- } else {
- /* send the filled skb up the n/w stack */
- skb_put(skb, pkt_len);
- skb->protocol = eth_type_trans(skb, ndev);
- napi_gro_receive(&emac->napi_rx, skb);
- ndev->stats.rx_bytes += pkt_len;
- ndev->stats.rx_packets++;
- }
-
- /* queue another RX DMA */
- ret = prueth_dma_rx_push(emac, new_skb, &emac->rx_chns);
- if (WARN_ON(ret < 0)) {
- dev_kfree_skb_any(new_skb);
- ndev->stats.rx_errors++;
- ndev->stats.rx_dropped++;
- }
-
- return ret;
-}
-
-static void prueth_rx_cleanup(void *data, dma_addr_t desc_dma)
-{
- struct prueth_rx_chn *rx_chn = data;
- struct cppi5_host_desc_t *desc_rx;
- struct sk_buff *skb;
- dma_addr_t buf_dma;
- u32 buf_dma_len;
- void **swdata;
-
- desc_rx = k3_cppi_desc_pool_dma2virt(rx_chn->desc_pool, desc_dma);
- swdata = cppi5_hdesc_get_swdata(desc_rx);
- skb = *swdata;
- cppi5_hdesc_get_obuf(desc_rx, &buf_dma, &buf_dma_len);
- k3_udma_glue_rx_cppi5_to_dma_addr(rx_chn->rx_chn, &buf_dma);
-
- dma_unmap_single(rx_chn->dma_dev, buf_dma, buf_dma_len,
- DMA_FROM_DEVICE);
- k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx);
-
- dev_kfree_skb_any(skb);
-}
-
static int emac_get_tx_ts(struct prueth_emac *emac,
struct emac_tx_ts_response *rsp)
{
@@ -661,208 +102,6 @@ static void tx_ts_work(struct prueth_emac *emac)
}
}
-static int prueth_tx_ts_cookie_get(struct prueth_emac *emac)
-{
- int i;
-
- /* search and get the next free slot */
- for (i = 0; i < PRUETH_MAX_TX_TS_REQUESTS; i++) {
- if (!emac->tx_ts_skb[i]) {
- emac->tx_ts_skb[i] = ERR_PTR(-EBUSY); /* reserve slot */
- return i;
- }
- }
-
- return -EBUSY;
-}
-
-/**
- * emac_ndo_start_xmit - EMAC Transmit function
- * @skb: SKB pointer
- * @ndev: EMAC network adapter
- *
- * Called by the system to transmit a packet - we queue the packet in
- * EMAC hardware transmit queue
- * Doesn't wait for completion we'll check for TX completion in
- * emac_tx_complete_packets().
- *
- * Return: enum netdev_tx
- */
-static enum netdev_tx emac_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev)
-{
- struct cppi5_host_desc_t *first_desc, *next_desc, *cur_desc;
- struct prueth_emac *emac = netdev_priv(ndev);
- struct netdev_queue *netif_txq;
- struct prueth_tx_chn *tx_chn;
- dma_addr_t desc_dma, buf_dma;
- int i, ret = 0, q_idx;
- bool in_tx_ts = 0;
- int tx_ts_cookie;
- void **swdata;
- u32 pkt_len;
- u32 *epib;
-
- pkt_len = skb_headlen(skb);
- q_idx = skb_get_queue_mapping(skb);
-
- tx_chn = &emac->tx_chns[q_idx];
- netif_txq = netdev_get_tx_queue(ndev, q_idx);
-
- /* Map the linear buffer */
- buf_dma = dma_map_single(tx_chn->dma_dev, skb->data, pkt_len, DMA_TO_DEVICE);
- if (dma_mapping_error(tx_chn->dma_dev, buf_dma)) {
- netdev_err(ndev, "tx: failed to map skb buffer\n");
- ret = NETDEV_TX_OK;
- goto drop_free_skb;
- }
-
- first_desc = k3_cppi_desc_pool_alloc(tx_chn->desc_pool);
- if (!first_desc) {
- netdev_dbg(ndev, "tx: failed to allocate descriptor\n");
- dma_unmap_single(tx_chn->dma_dev, buf_dma, pkt_len, DMA_TO_DEVICE);
- goto drop_stop_q_busy;
- }
-
- cppi5_hdesc_init(first_desc, CPPI5_INFO0_HDESC_EPIB_PRESENT,
- PRUETH_NAV_PS_DATA_SIZE);
- cppi5_hdesc_set_pkttype(first_desc, 0);
- epib = first_desc->epib;
- epib[0] = 0;
- epib[1] = 0;
- if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP &&
- emac->tx_ts_enabled) {
- tx_ts_cookie = prueth_tx_ts_cookie_get(emac);
- if (tx_ts_cookie >= 0) {
- skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
- /* Request TX timestamp */
- epib[0] = (u32)tx_ts_cookie;
- epib[1] = 0x80000000; /* TX TS request */
- emac->tx_ts_skb[tx_ts_cookie] = skb_get(skb);
- in_tx_ts = 1;
- }
- }
-
- /* set dst tag to indicate internal qid at the firmware which is at
- * bit8..bit15. bit0..bit7 indicates port num for directed
- * packets in case of switch mode operation
- */
- cppi5_desc_set_tags_ids(&first_desc->hdr, 0, (emac->port_id | (q_idx << 8)));
- k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &buf_dma);
- cppi5_hdesc_attach_buf(first_desc, buf_dma, pkt_len, buf_dma, pkt_len);
- swdata = cppi5_hdesc_get_swdata(first_desc);
- *swdata = skb;
-
- /* Handle the case where skb is fragmented in pages */
- cur_desc = first_desc;
- for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
- skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
- u32 frag_size = skb_frag_size(frag);
-
- next_desc = k3_cppi_desc_pool_alloc(tx_chn->desc_pool);
- if (!next_desc) {
- netdev_err(ndev,
- "tx: failed to allocate frag. descriptor\n");
- goto free_desc_stop_q_busy_cleanup_tx_ts;
- }
-
- buf_dma = skb_frag_dma_map(tx_chn->dma_dev, frag, 0, frag_size,
- DMA_TO_DEVICE);
- if (dma_mapping_error(tx_chn->dma_dev, buf_dma)) {
- netdev_err(ndev, "tx: Failed to map skb page\n");
- k3_cppi_desc_pool_free(tx_chn->desc_pool, next_desc);
- ret = NETDEV_TX_OK;
- goto cleanup_tx_ts;
- }
-
- cppi5_hdesc_reset_hbdesc(next_desc);
- k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &buf_dma);
- cppi5_hdesc_attach_buf(next_desc,
- buf_dma, frag_size, buf_dma, frag_size);
-
- desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool,
- next_desc);
- k3_udma_glue_tx_dma_to_cppi5_addr(tx_chn->tx_chn, &desc_dma);
- cppi5_hdesc_link_hbdesc(cur_desc, desc_dma);
-
- pkt_len += frag_size;
- cur_desc = next_desc;
- }
- WARN_ON_ONCE(pkt_len != skb->len);
-
- /* report bql before sending packet */
- netdev_tx_sent_queue(netif_txq, pkt_len);
-
- cppi5_hdesc_set_pktlen(first_desc, pkt_len);
- desc_dma = k3_cppi_desc_pool_virt2dma(tx_chn->desc_pool, first_desc);
- /* cppi5_desc_dump(first_desc, 64); */
-
- skb_tx_timestamp(skb); /* SW timestamp if SKBTX_IN_PROGRESS not set */
- ret = k3_udma_glue_push_tx_chn(tx_chn->tx_chn, first_desc, desc_dma);
- if (ret) {
- netdev_err(ndev, "tx: push failed: %d\n", ret);
- goto drop_free_descs;
- }
-
- if (in_tx_ts)
- atomic_inc(&emac->tx_ts_pending);
-
- if (k3_cppi_desc_pool_avail(tx_chn->desc_pool) < MAX_SKB_FRAGS) {
- netif_tx_stop_queue(netif_txq);
- /* Barrier, so that stop_queue visible to other cpus */
- smp_mb__after_atomic();
-
- if (k3_cppi_desc_pool_avail(tx_chn->desc_pool) >=
- MAX_SKB_FRAGS)
- netif_tx_wake_queue(netif_txq);
- }
-
- return NETDEV_TX_OK;
-
-cleanup_tx_ts:
- if (in_tx_ts) {
- dev_kfree_skb_any(emac->tx_ts_skb[tx_ts_cookie]);
- emac->tx_ts_skb[tx_ts_cookie] = NULL;
- }
-
-drop_free_descs:
- prueth_xmit_free(tx_chn, first_desc);
-
-drop_free_skb:
- dev_kfree_skb_any(skb);
-
- /* error */
- ndev->stats.tx_dropped++;
- netdev_err(ndev, "tx: error: %d\n", ret);
-
- return ret;
-
-free_desc_stop_q_busy_cleanup_tx_ts:
- if (in_tx_ts) {
- dev_kfree_skb_any(emac->tx_ts_skb[tx_ts_cookie]);
- emac->tx_ts_skb[tx_ts_cookie] = NULL;
- }
- prueth_xmit_free(tx_chn, first_desc);
-
-drop_stop_q_busy:
- netif_tx_stop_queue(netif_txq);
- return NETDEV_TX_BUSY;
-}
-
-static void prueth_tx_cleanup(void *data, dma_addr_t desc_dma)
-{
- struct prueth_tx_chn *tx_chn = data;
- struct cppi5_host_desc_t *desc_tx;
- struct sk_buff *skb;
- void **swdata;
-
- desc_tx = k3_cppi_desc_pool_dma2virt(tx_chn->desc_pool, desc_dma);
- swdata = cppi5_hdesc_get_swdata(desc_tx);
- skb = *(swdata);
- prueth_xmit_free(tx_chn, desc_tx);
-
- dev_kfree_skb_any(skb);
-}
-
static irqreturn_t prueth_tx_ts_irq(int irq, void *dev_id)
{
struct prueth_emac *emac = dev_id;
@@ -873,22 +112,6 @@ static irqreturn_t prueth_tx_ts_irq(int irq, void *dev_id)
return IRQ_HANDLED;
}
-static irqreturn_t prueth_rx_irq(int irq, void *dev_id)
-{
- struct prueth_emac *emac = dev_id;
-
- disable_irq_nosync(irq);
- napi_schedule(&emac->napi_rx);
-
- return IRQ_HANDLED;
-}
-
-struct icssg_firmwares {
- char *pru;
- char *rtu;
- char *txpru;
-};
-
static struct icssg_firmwares icssg_emac_firmwares[] = {
{
.pru = "ti-pruss/am65x-sr2-pru0-prueth-fw.elf",
@@ -953,41 +176,6 @@ static int prueth_emac_start(struct prueth *prueth, struct prueth_emac *emac)
return ret;
}
-static void prueth_emac_stop(struct prueth_emac *emac)
-{
- struct prueth *prueth = emac->prueth;
- int slice;
-
- switch (emac->port_id) {
- case PRUETH_PORT_MII0:
- slice = ICSS_SLICE0;
- break;
- case PRUETH_PORT_MII1:
- slice = ICSS_SLICE1;
- break;
- default:
- netdev_err(emac->ndev, "invalid port\n");
- return;
- }
-
- emac->fw_running = 0;
- rproc_shutdown(prueth->txpru[slice]);
- rproc_shutdown(prueth->rtu[slice]);
- rproc_shutdown(prueth->pru[slice]);
-}
-
-static void prueth_cleanup_tx_ts(struct prueth_emac *emac)
-{
- int i;
-
- for (i = 0; i < PRUETH_MAX_TX_TS_REQUESTS; i++) {
- if (emac->tx_ts_skb[i]) {
- dev_kfree_skb_any(emac->tx_ts_skb[i]);
- emac->tx_ts_skb[i] = NULL;
- }
- }
-}
-
/* called back by PHY layer if there is change in link state of hw port*/
static void emac_adjust_link(struct net_device *ndev)
{
@@ -1055,86 +243,6 @@ static void emac_adjust_link(struct net_device *ndev)
}
}
-static int emac_napi_rx_poll(struct napi_struct *napi_rx, int budget)
-{
- struct prueth_emac *emac = prueth_napi_to_emac(napi_rx);
- int rx_flow = PRUETH_RX_FLOW_DATA;
- int flow = PRUETH_MAX_RX_FLOWS;
- int num_rx = 0;
- int cur_budget;
- int ret;
-
- while (flow--) {
- cur_budget = budget - num_rx;
-
- while (cur_budget--) {
- ret = emac_rx_packet(emac, flow);
- if (ret)
- break;
- num_rx++;
- }
-
- if (num_rx >= budget)
- break;
- }
-
- if (num_rx < budget && napi_complete_done(napi_rx, num_rx))
- enable_irq(emac->rx_chns.irq[rx_flow]);
-
- return num_rx;
-}
-
-static int prueth_prepare_rx_chan(struct prueth_emac *emac,
- struct prueth_rx_chn *chn,
- int buf_size)
-{
- struct sk_buff *skb;
- int i, ret;
-
- for (i = 0; i < chn->descs_num; i++) {
- skb = __netdev_alloc_skb_ip_align(NULL, buf_size, GFP_KERNEL);
- if (!skb)
- return -ENOMEM;
-
- ret = prueth_dma_rx_push(emac, skb, chn);
- if (ret < 0) {
- netdev_err(emac->ndev,
- "cannot submit skb for rx chan %s ret %d\n",
- chn->name, ret);
- kfree_skb(skb);
- return ret;
- }
- }
-
- return 0;
-}
-
-static void prueth_reset_tx_chan(struct prueth_emac *emac, int ch_num,
- bool free_skb)
-{
- int i;
-
- for (i = 0; i < ch_num; i++) {
- if (free_skb)
- k3_udma_glue_reset_tx_chn(emac->tx_chns[i].tx_chn,
- &emac->tx_chns[i],
- prueth_tx_cleanup);
- k3_udma_glue_disable_tx_chn(emac->tx_chns[i].tx_chn);
- }
-}
-
-static void prueth_reset_rx_chan(struct prueth_rx_chn *chn,
- int num_flows, bool disable)
-{
- int i;
-
- for (i = 0; i < num_flows; i++)
- k3_udma_glue_reset_rx_chn(chn->rx_chn, i, chn,
- prueth_rx_cleanup, !!i);
- if (disable)
- k3_udma_glue_disable_rx_chn(chn->rx_chn);
-}
-
static int emac_phy_connect(struct prueth_emac *emac)
{
struct prueth *prueth = emac->prueth;
@@ -1508,11 +616,6 @@ static int emac_ndo_stop(struct net_device *ndev)
return 0;
}
-static void emac_ndo_tx_timeout(struct net_device *ndev, unsigned int txqueue)
-{
- ndev->stats.tx_errors++;
-}
-
static void emac_ndo_set_rx_mode_work(struct work_struct *work)
{
struct prueth_emac *emac = container_of(work, struct prueth_emac, rx_mode_work);
@@ -1558,116 +661,6 @@ static void emac_ndo_set_rx_mode(struct net_device *ndev)
queue_work(emac->cmd_wq, &emac->rx_mode_work);
}
-static int emac_set_ts_config(struct net_device *ndev, struct ifreq *ifr)
-{
- struct prueth_emac *emac = netdev_priv(ndev);
- struct hwtstamp_config config;
-
- if (copy_from_user(&config, ifr->ifr_data, sizeof(config)))
- return -EFAULT;
-
- switch (config.tx_type) {
- case HWTSTAMP_TX_OFF:
- emac->tx_ts_enabled = 0;
- break;
- case HWTSTAMP_TX_ON:
- emac->tx_ts_enabled = 1;
- break;
- default:
- return -ERANGE;
- }
-
- switch (config.rx_filter) {
- case HWTSTAMP_FILTER_NONE:
- emac->rx_ts_enabled = 0;
- break;
- case HWTSTAMP_FILTER_ALL:
- case HWTSTAMP_FILTER_SOME:
- case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
- case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
- case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
- case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
- case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
- case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
- case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
- case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
- case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
- case HWTSTAMP_FILTER_PTP_V2_EVENT:
- case HWTSTAMP_FILTER_PTP_V2_SYNC:
- case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
- case HWTSTAMP_FILTER_NTP_ALL:
- emac->rx_ts_enabled = 1;
- config.rx_filter = HWTSTAMP_FILTER_ALL;
- break;
- default:
- return -ERANGE;
- }
-
- return copy_to_user(ifr->ifr_data, &config, sizeof(config)) ?
- -EFAULT : 0;
-}
-
-static int emac_get_ts_config(struct net_device *ndev, struct ifreq *ifr)
-{
- struct prueth_emac *emac = netdev_priv(ndev);
- struct hwtstamp_config config;
-
- config.flags = 0;
- config.tx_type = emac->tx_ts_enabled ? HWTSTAMP_TX_ON : HWTSTAMP_TX_OFF;
- config.rx_filter = emac->rx_ts_enabled ? HWTSTAMP_FILTER_ALL : HWTSTAMP_FILTER_NONE;
-
- return copy_to_user(ifr->ifr_data, &config, sizeof(config)) ?
- -EFAULT : 0;
-}
-
-static int emac_ndo_ioctl(struct net_device *ndev, struct ifreq *ifr, int cmd)
-{
- switch (cmd) {
- case SIOCGHWTSTAMP:
- return emac_get_ts_config(ndev, ifr);
- case SIOCSHWTSTAMP:
- return emac_set_ts_config(ndev, ifr);
- default:
- break;
- }
-
- return phy_do_ioctl(ndev, ifr, cmd);
-}
-
-static void emac_ndo_get_stats64(struct net_device *ndev,
- struct rtnl_link_stats64 *stats)
-{
- struct prueth_emac *emac = netdev_priv(ndev);
-
- emac_update_hardware_stats(emac);
-
- stats->rx_packets = emac_get_stat_by_name(emac, "rx_packets");
- stats->rx_bytes = emac_get_stat_by_name(emac, "rx_bytes");
- stats->tx_packets = emac_get_stat_by_name(emac, "tx_packets");
- stats->tx_bytes = emac_get_stat_by_name(emac, "tx_bytes");
- stats->rx_crc_errors = emac_get_stat_by_name(emac, "rx_crc_errors");
- stats->rx_over_errors = emac_get_stat_by_name(emac, "rx_over_errors");
- stats->multicast = emac_get_stat_by_name(emac, "rx_multicast_frames");
-
- stats->rx_errors = ndev->stats.rx_errors;
- stats->rx_dropped = ndev->stats.rx_dropped;
- stats->tx_errors = ndev->stats.tx_errors;
- stats->tx_dropped = ndev->stats.tx_dropped;
-}
-
-static int emac_ndo_get_phys_port_name(struct net_device *ndev, char *name,
- size_t len)
-{
- struct prueth_emac *emac = netdev_priv(ndev);
- int ret;
-
- ret = snprintf(name, len, "p%d", emac->port_id);
- if (ret >= len)
- return -EINVAL;
-
- return 0;
-}
-
static const struct net_device_ops emac_netdev_ops = {
.ndo_open = emac_ndo_open,
.ndo_stop = emac_ndo_stop,
@@ -1681,42 +674,6 @@ static const struct net_device_ops emac_netdev_ops = {
.ndo_get_phys_port_name = emac_ndo_get_phys_port_name,
};
-/* get emac_port corresponding to eth_node name */
-static int prueth_node_port(struct device_node *eth_node)
-{
- u32 port_id;
- int ret;
-
- ret = of_property_read_u32(eth_node, "reg", &port_id);
- if (ret)
- return ret;
-
- if (port_id == 0)
- return PRUETH_PORT_MII0;
- else if (port_id == 1)
- return PRUETH_PORT_MII1;
- else
- return PRUETH_PORT_INVALID;
-}
-
-/* get MAC instance corresponding to eth_node name */
-static int prueth_node_mac(struct device_node *eth_node)
-{
- u32 port_id;
- int ret;
-
- ret = of_property_read_u32(eth_node, "reg", &port_id);
- if (ret)
- return ret;
-
- if (port_id == 0)
- return PRUETH_MAC0;
- else if (port_id == 1)
- return PRUETH_MAC1;
- else
- return PRUETH_MAC_INVALID;
-}
-
static int prueth_netdev_init(struct prueth *prueth,
struct device_node *eth_node)
{
@@ -1860,90 +817,6 @@ static int prueth_netdev_init(struct prueth *prueth,
return ret;
}
-static void prueth_netdev_exit(struct prueth *prueth,
- struct device_node *eth_node)
-{
- struct prueth_emac *emac;
- enum prueth_mac mac;
-
- mac = prueth_node_mac(eth_node);
- if (mac == PRUETH_MAC_INVALID)
- return;
-
- emac = prueth->emac[mac];
- if (!emac)
- return;
-
- if (of_phy_is_fixed_link(emac->phy_node))
- of_phy_deregister_fixed_link(emac->phy_node);
-
- netif_napi_del(&emac->napi_rx);
-
- pruss_release_mem_region(prueth->pruss, &emac->dram);
- destroy_workqueue(emac->cmd_wq);
- free_netdev(emac->ndev);
- prueth->emac[mac] = NULL;
-}
-
-static int prueth_get_cores(struct prueth *prueth, int slice)
-{
- struct device *dev = prueth->dev;
- enum pruss_pru_id pruss_id;
- struct device_node *np;
- int idx = -1, ret;
-
- np = dev->of_node;
-
- switch (slice) {
- case ICSS_SLICE0:
- idx = 0;
- break;
- case ICSS_SLICE1:
- idx = 3;
- break;
- default:
- return -EINVAL;
- }
-
- prueth->pru[slice] = pru_rproc_get(np, idx, &pruss_id);
- if (IS_ERR(prueth->pru[slice])) {
- ret = PTR_ERR(prueth->pru[slice]);
- prueth->pru[slice] = NULL;
- return dev_err_probe(dev, ret, "unable to get PRU%d\n", slice);
- }
- prueth->pru_id[slice] = pruss_id;
-
- idx++;
- prueth->rtu[slice] = pru_rproc_get(np, idx, NULL);
- if (IS_ERR(prueth->rtu[slice])) {
- ret = PTR_ERR(prueth->rtu[slice]);
- prueth->rtu[slice] = NULL;
- return dev_err_probe(dev, ret, "unable to get RTU%d\n", slice);
- }
-
- idx++;
- prueth->txpru[slice] = pru_rproc_get(np, idx, NULL);
- if (IS_ERR(prueth->txpru[slice])) {
- ret = PTR_ERR(prueth->txpru[slice]);
- prueth->txpru[slice] = NULL;
- return dev_err_probe(dev, ret, "unable to get TX_PRU%d\n", slice);
- }
-
- return 0;
-}
-
-static void prueth_put_cores(struct prueth *prueth, int slice)
-{
- if (prueth->txpru[slice])
- pru_rproc_put(prueth->txpru[slice]);
-
- if (prueth->rtu[slice])
- pru_rproc_put(prueth->rtu[slice]);
-
- if (prueth->pru[slice])
- pru_rproc_put(prueth->pru[slice]);
-}
-
static int prueth_probe(struct platform_device *pdev)
{
struct device_node *eth_node, *eth_ports_node;
@@ -2273,62 +1146,6 @@ static void prueth_remove(struct platform_device *pdev)
prueth_put_cores(prueth, ICSS_SLICE0);
}
-#ifdef CONFIG_PM_SLEEP
-static int prueth_suspend(struct device *dev)
-{
- struct prueth *prueth = dev_get_drvdata(dev);
- struct net_device *ndev;
- int i, ret;
-
- for (i = 0; i < PRUETH_NUM_MACS; i++) {
- ndev = prueth->registered_netdevs[i];
-
- if (!ndev)
- continue;
-
- if (netif_running(ndev)) {
- netif_device_detach(ndev);
- ret = emac_ndo_stop(ndev);
- if (ret < 0) {
- netdev_err(ndev, "failed to stop: %d", ret);
- return ret;
- }
- }
- }
-
- return 0;
-}
-
-static int prueth_resume(struct device *dev)
-{
- struct prueth *prueth = dev_get_drvdata(dev);
- struct net_device *ndev;
- int i, ret;
-
- for (i = 0; i < PRUETH_NUM_MACS; i++) {
- ndev = prueth->registered_netdevs[i];
-
- if (!ndev)
- continue;
-
- if (netif_running(ndev)) {
- ret = emac_ndo_open(ndev);
- if (ret < 0) {
- netdev_err(ndev, "failed to start: %d", ret);
- return ret;
- }
- netif_device_attach(ndev);
- }
- }
-
- return 0;
-}
-#endif /* CONFIG_PM_SLEEP */
-
-static const struct dev_pm_ops prueth_dev_pm_ops = {
- SET_SYSTEM_SLEEP_PM_OPS(prueth_suspend, prueth_resume)
-};
-
static const struct prueth_pdata am654_icssg_pdata = {
.fdqring_mode = K3_RINGACC_RING_MODE_MESSAGE,
.quirk_10m_link_issue = 1,
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.h b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
index 8b6d6b497010..5d792e9bade0 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.h
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
@@ -55,6 +55,8 @@
#define ICSSG_NUM_STANDARD_STATS 31
#define ICSSG_NUM_ETHTOOL_STATS (ICSSG_NUM_STATS - ICSSG_NUM_STANDARD_STATS)
+#define IEP_DEFAULT_CYCLE_TIME_NS 1000000 /* 1 ms */
+
/* Firmware status codes */
#define ICSS_HS_FW_READY 0x55555555
#define ICSS_HS_FW_DEAD 0xDEAD0000 /* lower 16 bits contain error code */
@@ -188,6 +190,12 @@ struct prueth_pdata {
u32 quirk_10m_link_issue:1;
};
+struct icssg_firmwares {
+ char *pru;
+ char *rtu;
+ char *txpru;
+};
+
/**
* struct prueth - PRUeth structure
* @dev: device
@@ -257,6 +265,7 @@ static inline int prueth_emac_slice(struct prueth_emac *emac)
}
extern const struct ethtool_ops icssg_ethtool_ops;
+extern const struct dev_pm_ops prueth_dev_pm_ops;
/* Classifier helpers */
void icssg_class_set_mac_addr(struct regmap *miig_rt, int slice, u8 *mac);
@@ -285,4 +294,54 @@ u32 icssg_queue_level(struct prueth *prueth, int queue);
void emac_stats_work_handler(struct work_struct *work);
void emac_update_hardware_stats(struct prueth_emac *emac);
int emac_get_stat_by_name(struct prueth_emac *emac, char *stat_name);
+
+/* Common functions */
+void prueth_cleanup_rx_chns(struct prueth_emac *emac,
+ struct prueth_rx_chn *rx_chn,
+ int max_rflows);
+void prueth_cleanup_tx_chns(struct prueth_emac *emac);
+void prueth_ndev_del_tx_napi(struct prueth_emac *emac, int num);
+void prueth_xmit_free(struct prueth_tx_chn *tx_chn,
+ struct cppi5_host_desc_t *desc);
+int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
+ int budget);
+int prueth_ndev_add_tx_napi(struct prueth_emac *emac);
+int prueth_init_tx_chns(struct prueth_emac *emac);
+int prueth_init_rx_chns(struct prueth_emac *emac,
+ struct prueth_rx_chn *rx_chn,
+ char *name, u32 max_rflows,
+ u32 max_desc_num);
+int prueth_dma_rx_push(struct prueth_emac *emac,
+ struct sk_buff *skb,
+ struct prueth_rx_chn *rx_chn);
+void emac_rx_timestamp(struct prueth_emac *emac,
+ struct sk_buff *skb, u32 *psdata);
+enum netdev_tx emac_ndo_start_xmit(struct sk_buff *skb, struct net_device *ndev);
+irqreturn_t prueth_rx_irq(int irq, void *dev_id);
+void prueth_emac_stop(struct prueth_emac *emac);
+void prueth_cleanup_tx_ts(struct prueth_emac *emac);
+int emac_napi_rx_poll(struct napi_struct *napi_rx, int budget);
+int prueth_prepare_rx_chan(struct prueth_emac *emac,
+ struct prueth_rx_chn *chn,
+ int buf_size);
+void prueth_reset_tx_chan(struct prueth_emac *emac, int ch_num,
+ bool free_skb);
+void prueth_reset_rx_chan(struct prueth_rx_chn *chn,
+ int num_flows, bool disable);
+void emac_ndo_tx_timeout(struct net_device *ndev, unsigned int txqueue);
+int emac_ndo_ioctl(struct net_device *ndev, struct ifreq *ifr, int cmd);
+void emac_ndo_get_stats64(struct net_device *ndev,
+ struct rtnl_link_stats64 *stats);
+int emac_ndo_get_phys_port_name(struct net_device *ndev, char *name,
+ size_t len);
+int prueth_node_port(struct device_node *eth_node);
+int prueth_node_mac(struct device_node *eth_node);
+void prueth_netdev_exit(struct prueth *prueth,
+ struct device_node *eth_node);
+int prueth_get_cores(struct prueth *prueth, int slice);
+void prueth_put_cores(struct prueth *prueth, int slice);
+
+/* Revision specific helper */
+u64 icssg_ts_to_ns(u32 hi_sw, u32 hi, u32 lo, u32 cycle_time_ns);
+
#endif /* __NET_TI_ICSSG_PRUETH_H */
--
2.44.0
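
For readers following the timestamp path, the arithmetic in icssg_ts_to_ns(), which this patch exports through icssg_prueth.h, can be tried in userspace. The sketch below is a hedged reconstruction, not driver code: GENMASK32 is a hypothetical stand-in for the kernel's GENMASK(), the fixed-width types replace u32/u64, and the field names simply mirror the removed function. The comments reflect one reading of the code rather than firmware documentation: the low 20 bits of "lo" are used directly as nanoseconds, while the remaining bits count cycles that are scaled by cycle_time_ns.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's GENMASK(h, l), 32-bit only. */
#define GENMASK32(h, l) ((~0u >> (31 - (h))) & (~0u << (l)))

/* Mirrors the arithmetic of icssg_ts_to_ns() from the patch. */
static uint64_t icssg_ts_to_ns(uint32_t hi_sw, uint32_t hi, uint32_t lo,
			       uint32_t cycle_time_ns)
{
	uint32_t iepcount_lo, iepcount_hi, hi_rollover_count;
	uint64_t ns;

	/* Low 20 bits of lo: added unscaled as the sub-cycle part. */
	iepcount_lo = lo & GENMASK32(19, 0);
	/* Low 12 bits of hi and the top 12 bits of lo form the cycle count. */
	iepcount_hi = (hi & GENMASK32(11, 0)) << 12 | lo >> 20;
	/* Remaining bits of hi are treated as a rollover counter. */
	hi_rollover_count = hi >> 11;

	ns = ((uint64_t)hi_rollover_count) << 23 | (iepcount_hi + hi_sw);
	ns = ns * cycle_time_ns + iepcount_lo;

	return ns;
}
```

With the patch's IEP_DEFAULT_CYCLE_TIME_NS of 1000000, a descriptor carrying hi = 0 and lo = (3 << 20) | 500 would convert to 3 * 1000000 + 500 ns, and a non-zero hi_sw shifts the result by whole cycles.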
^ permalink raw reply related
* Re: [PATCH v2 16/18] PCI: rockchip-ep: Improve link training
From: Rick Wertenbroek @ 2024-04-03 11:54 UTC (permalink / raw)
To: Damien Le Moal
Cc: Manivannan Sadhasivam, Lorenzo Pieralisi, Kishon Vijay Abraham I,
Shawn Lin, Krzysztof Wilczyński, Bjorn Helgaas,
Heiko Stuebner, linux-pci, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, devicetree, linux-rockchip, linux-arm-kernel,
Wilfred Mallawa, Niklas Cassel
In-Reply-To: <20240330041928.1555578-17-dlemoal@kernel.org>
On Sat, Mar 30, 2024 at 5:20 AM Damien Le Moal <dlemoal@kernel.org> wrote:
>
> The Rockchip rk3399 technical reference manual describes the endpoint mode
> link training process clearly and states that:
> Insure link training completion and success by observing link_st field
> in PCIe Client BASIC_STATUS1 register change to 2'b11. If both side
> support PCIe Gen2 speed, re-train can be Initiated by asserting the
> Retrain Link field in Link Control and Status Register. The software
> should insure the BASIC_STATUS0[negotiated_speed] changes to "1", that
> indicates re-train to Gen2 successfully.
> This procedure is very similar to what is done for the root-port mode in
> rockchip_pcie_host_init_port().
>
> Implement this link training procedure for the endpoint mode as well.
> Given that the rk3399 SoC does not have an interrupt signaling link
> status changes, training is implemented as a delayed work which is
> rescheduled until the link training completes or the endpoint controller
> is stopped. The link training work is first scheduled in
> rockchip_pcie_ep_start() when the endpoint function is started. Link
> training completion is signaled to the function using pci_epc_linkup().
> Accordingly, the linkup_notifier field of the rockchip pci_epc_features
> structure is changed to true.
>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
> drivers/pci/controller/pcie-rockchip-ep.c | 79 ++++++++++++++++++++++-
> drivers/pci/controller/pcie-rockchip.h | 11 ++++
> 2 files changed, 89 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> index 2767e8f1771d..4006e7dee71a 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -16,6 +16,8 @@
> #include <linux/platform_device.h>
> #include <linux/pci-epf.h>
> #include <linux/sizes.h>
> +#include <linux/workqueue.h>
> +#include <linux/iopoll.h>
>
> #include "pcie-rockchip.h"
>
> @@ -48,6 +50,7 @@ struct rockchip_pcie_ep {
> u64 irq_pci_addr;
> u8 irq_pci_fn;
> u8 irq_pending;
> + struct delayed_work link_training;
> };
>
> static void rockchip_pcie_clear_ep_ob_atu(struct rockchip_pcie *rockchip,
> @@ -467,6 +470,8 @@ static int rockchip_pcie_ep_start(struct pci_epc *epc)
> PCIE_CLIENT_CONF_ENABLE,
> PCIE_CLIENT_CONFIG);
>
> + schedule_delayed_work(&ep->link_training, 0);
> +
> return 0;
> }
>
> @@ -475,6 +480,8 @@ static void rockchip_pcie_ep_stop(struct pci_epc *epc)
> struct rockchip_pcie_ep *ep = epc_get_drvdata(epc);
> struct rockchip_pcie *rockchip = &ep->rockchip;
>
> + cancel_delayed_work_sync(&ep->link_training);
> +
> /* Stop link training and disable configuration */
> rockchip_pcie_write(rockchip,
> PCIE_CLIENT_CONF_DISABLE |
> @@ -482,8 +489,77 @@ static void rockchip_pcie_ep_stop(struct pci_epc *epc)
> PCIE_CLIENT_CONFIG);
> }
>
> +static void rockchip_pcie_ep_retrain_link(struct rockchip_pcie *rockchip)
> +{
> + u32 status;
> +
> + status = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_LCS);
> + status |= PCI_EXP_LNKCTL_RL;
> + rockchip_pcie_write(rockchip, status, PCIE_EP_CONFIG_LCS);
> +}
> +
> +static bool rockchip_pcie_ep_link_up(struct rockchip_pcie *rockchip)
> +{
> + u32 val = rockchip_pcie_read(rockchip, PCIE_CLIENT_BASIC_STATUS1);
> +
> + return PCIE_LINK_UP(val);
> +}
> +
> +static void rockchip_pcie_ep_link_training(struct work_struct *work)
> +{
> + struct rockchip_pcie_ep *ep =
> + container_of(work, struct rockchip_pcie_ep, link_training.work);
> + struct rockchip_pcie *rockchip = &ep->rockchip;
> + struct device *dev = rockchip->dev;
> + u32 val;
> + int ret;
> +
> + /* Enable Gen1 training and wait for its completion */
> + ret = readl_poll_timeout(rockchip->apb_base + PCIE_CORE_CTRL,
> + val, PCIE_LINK_TRAINING_DONE(val), 50,
> + LINK_TRAIN_TIMEOUT);
> + if (ret)
> + goto again;
> +
> + /* Make sure that the link is up */
> + ret = readl_poll_timeout(rockchip->apb_base + PCIE_CLIENT_BASIC_STATUS1,
> + val, PCIE_LINK_UP(val), 50,
> + LINK_TRAIN_TIMEOUT);
> + if (ret)
> + goto again;
> +
> + /* Check the current speed */
> + val = rockchip_pcie_read(rockchip, PCIE_CORE_CTRL);
> + if (!PCIE_LINK_IS_GEN2(val) && rockchip->link_gen == 2) {
> + /* Enable retrain for gen2 */
> + rockchip_pcie_ep_retrain_link(rockchip);
> + readl_poll_timeout(rockchip->apb_base + PCIE_CORE_CTRL,
> + val, PCIE_LINK_IS_GEN2(val), 50,
> + LINK_TRAIN_TIMEOUT);
> + }
> +
> + /* Check again that the link is up */
> + if (!rockchip_pcie_ep_link_up(rockchip))
> + goto again;
> +
> + val = rockchip_pcie_read(rockchip, PCIE_CLIENT_BASIC_STATUS0);
> + dev_info(dev,
> + "Link UP (Negociated speed: %sGT/s, width: x%lu)\n",
> + (val & PCIE_CLIENT_NEG_LINK_SPEED) ? "5" : "2.5",
> + ((val & PCIE_CLIENT_NEG_LINK_WIDTH_MASK) >>
> + PCIE_CLIENT_NEG_LINK_WIDTH_SHIFT) << 1);
> +
This does not print the correct link width for x1:
# [ 60.518339] rockchip-pcie-ep fd000000.pcie-ep: Link UP
(Negociated speed: 5GT/s, width: x0)
This is because
((val & PCIE_CLIENT_NEG_LINK_WIDTH_MASK) >>
PCIE_CLIENT_NEG_LINK_WIDTH_SHIFT) << 1
evaluates to 0 when the link width is x1: bits 7:6 are 0b00, and
0b00 << 1 is still 0 (0b00 => x0, 0b01 => x2, 0b10 => x4).
The formula should therefore be:
1 << ((val & PCIE_CLIENT_NEG_LINK_WIDTH_MASK) >>
PCIE_CLIENT_NEG_LINK_WIDTH_SHIFT)
This gives the correct link width in all cases (0b00 => x1, 0b01 =>
x2, 0b10 => x4).
Reference: RK3399 TRM V1.3, pages 768-769, PCIE_CLIENT_BASIC_STATUS0
register description.
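For illustration only, here is a minimal user-space sketch comparing the two formulas. It assumes nothing beyond the field layout in the quoted header change (negotiated width in bits 7:6 of BASIC_STATUS0). The helper and macro names are hypothetical and not part of the driver.

```c
#include <stdint.h>

/* Field layout as in the quoted patch: GENMASK(7, 6) and shift 6. */
#define NEG_LINK_WIDTH_MASK	0xc0u
#define NEG_LINK_WIDTH_SHIFT	6

/* Formula as posted in the patch: field << 1 (hypothetical helper). */
static inline unsigned int width_as_posted(uint32_t val)
{
	return ((val & NEG_LINK_WIDTH_MASK) >> NEG_LINK_WIDTH_SHIFT) << 1;
}

/* Formula suggested in this reply: 1 << field (hypothetical helper). */
static inline unsigned int width_suggested(uint32_t val)
{
	return 1u << ((val & NEG_LINK_WIDTH_MASK) >> NEG_LINK_WIDTH_SHIFT);
}
```

Both helpers agree for the 0b01 and 0b10 encodings (x2 and x4). They differ only for 0b00, where the posted formula yields 0 and the suggested one yields 1.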
> + /* Notify the function */
> + pci_epc_linkup(ep->epc);
> +
> + return;
> +
> +again:
> + schedule_delayed_work(&ep->link_training, msecs_to_jiffies(5));
> +}
> +
> static const struct pci_epc_features rockchip_pcie_epc_features = {
> - .linkup_notifier = false,
> + .linkup_notifier = true,
> .msi_capable = true,
> .msix_capable = false,
> .align = ROCKCHIP_PCIE_AT_SIZE_ALIGN,
> @@ -644,6 +720,7 @@ static int rockchip_pcie_ep_probe(struct platform_device *pdev)
> rockchip = &ep->rockchip;
> rockchip->is_rc = false;
> rockchip->dev = dev;
> + INIT_DELAYED_WORK(&ep->link_training, rockchip_pcie_ep_link_training);
>
> epc = devm_pci_epc_create(dev, &rockchip_pcie_epc_ops);
> if (IS_ERR(epc)) {
> diff --git a/drivers/pci/controller/pcie-rockchip.h b/drivers/pci/controller/pcie-rockchip.h
> index 0263f158ee8d..3963b7097a91 100644
> --- a/drivers/pci/controller/pcie-rockchip.h
> +++ b/drivers/pci/controller/pcie-rockchip.h
> @@ -26,6 +26,7 @@
> #define MAX_LANE_NUM 4
> #define MAX_REGION_LIMIT 32
> #define MIN_EP_APERTURE 28
> +#define LINK_TRAIN_TIMEOUT (5000 * USEC_PER_MSEC)
>
> #define PCIE_CLIENT_BASE 0x0
> #define PCIE_CLIENT_CONFIG (PCIE_CLIENT_BASE + 0x00)
> @@ -50,6 +51,10 @@
> #define PCIE_CLIENT_DEBUG_LTSSM_MASK GENMASK(5, 0)
> #define PCIE_CLIENT_DEBUG_LTSSM_L1 0x18
> #define PCIE_CLIENT_DEBUG_LTSSM_L2 0x19
> +#define PCIE_CLIENT_BASIC_STATUS0 (PCIE_CLIENT_BASE + 0x44)
> +#define PCIE_CLIENT_NEG_LINK_WIDTH_MASK GENMASK(7, 6)
> +#define PCIE_CLIENT_NEG_LINK_WIDTH_SHIFT 6
> +#define PCIE_CLIENT_NEG_LINK_SPEED BIT(5)
> #define PCIE_CLIENT_BASIC_STATUS1 (PCIE_CLIENT_BASE + 0x48)
> #define PCIE_CLIENT_LINK_STATUS_UP 0x00300000
> #define PCIE_CLIENT_LINK_STATUS_MASK 0x00300000
> @@ -87,6 +92,8 @@
>
> #define PCIE_CORE_CTRL_MGMT_BASE 0x900000
> #define PCIE_CORE_CTRL (PCIE_CORE_CTRL_MGMT_BASE + 0x000)
> +#define PCIE_CORE_PL_CONF_LS_MASK 0x00000001
> +#define PCIE_CORE_PL_CONF_LS_READY 0x00000001
> #define PCIE_CORE_PL_CONF_SPEED_5G 0x00000008
> #define PCIE_CORE_PL_CONF_SPEED_MASK 0x00000018
> #define PCIE_CORE_PL_CONF_LANE_MASK 0x00000006
> @@ -144,6 +151,7 @@
> #define PCIE_RC_CONFIG_BASE 0xa00000
> #define PCIE_EP_CONFIG_BASE 0xa00000
> #define PCIE_EP_CONFIG_DID_VID (PCIE_EP_CONFIG_BASE + 0x00)
> +#define PCIE_EP_CONFIG_LCS (PCIE_EP_CONFIG_BASE + 0xd0)
> #define PCIE_RC_CONFIG_RID_CCR (PCIE_RC_CONFIG_BASE + 0x08)
> #define PCIE_RC_CONFIG_DCR (PCIE_RC_CONFIG_BASE + 0xc4)
> #define PCIE_RC_CONFIG_DCR_CSPL_SHIFT 18
> @@ -155,6 +163,7 @@
> #define PCIE_RC_CONFIG_LINK_CAP (PCIE_RC_CONFIG_BASE + 0xcc)
> #define PCIE_RC_CONFIG_LINK_CAP_L0S BIT(10)
> #define PCIE_RC_CONFIG_LCS (PCIE_RC_CONFIG_BASE + 0xd0)
> +#define PCIE_EP_CONFIG_LCS (PCIE_EP_CONFIG_BASE + 0xd0)
> #define PCIE_RC_CONFIG_L1_SUBSTATE_CTRL2 (PCIE_RC_CONFIG_BASE + 0x90c)
> #define PCIE_RC_CONFIG_THP_CAP (PCIE_RC_CONFIG_BASE + 0x274)
> #define PCIE_RC_CONFIG_THP_CAP_NEXT_MASK GENMASK(31, 20)
> @@ -192,6 +201,8 @@
> #define ROCKCHIP_VENDOR_ID 0x1d87
> #define PCIE_LINK_IS_L2(x) \
> (((x) & PCIE_CLIENT_DEBUG_LTSSM_MASK) == PCIE_CLIENT_DEBUG_LTSSM_L2)
> +#define PCIE_LINK_TRAINING_DONE(x) \
> + (((x) & PCIE_CORE_PL_CONF_LS_MASK) == PCIE_CORE_PL_CONF_LS_READY)
> #define PCIE_LINK_UP(x) \
> (((x) & PCIE_CLIENT_LINK_STATUS_MASK) == PCIE_CLIENT_LINK_STATUS_UP)
> #define PCIE_LINK_IS_GEN2(x) \
> --
> 2.44.0
>
Tested-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Best regards,
Rick
^ permalink raw reply
* [PATCH net-next v6 09/10] net: ti: icssg-prueth: Modify common functions for SR1.0
From: Diogo Ivo @ 2024-04-03 10:48 UTC (permalink / raw)
To: danishanwar, rogerq, davem, edumazet, kuba, pabeni, andrew,
dan.carpenter, linux-arm-kernel, netdev
Cc: Diogo Ivo, jan.kiszka
In-Reply-To: <20240403104821.283832-1-diogo.ivo@siemens.com>
Some parts of the logic differ only slightly between Silicon Revisions.
In these cases add the bits that differ to a common function that
executes those bits conditionally based on the Silicon Revision.
Based on the work of Roger Quadros, Vignesh Raghavendra and
Grygorii Strashko in TI's 5.10 SDK [1].
[1]: https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/tree/?h=ti-linux-5.10.y
Co-developed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Diogo Ivo <diogo.ivo@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
---
Changes in v5:
- Remove useless budget++ in emac_tx_complete_packets()
- Added Reviewed-by tags from Roger and Danish
Changes in v4:
- Explicitly check for SR1.0 when managing rxmgm channel
- Pass is_sr1 = false to prueth_get_cores() from SR2.0 driver
drivers/net/ethernet/ti/icssg/icssg_common.c | 45 +++++++++++++++-----
drivers/net/ethernet/ti/icssg/icssg_prueth.c | 4 +-
drivers/net/ethernet/ti/icssg/icssg_prueth.h | 2 +-
3 files changed, 37 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/ti/icssg/icssg_common.c b/drivers/net/ethernet/ti/icssg/icssg_common.c
index 99f27ecc9352..1d62c05b5f7c 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_common.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_common.c
@@ -152,6 +152,12 @@ int emac_tx_complete_packets(struct prueth_emac *emac, int chn,
desc_dma);
swdata = cppi5_hdesc_get_swdata(desc_tx);
+ /* was this command's TX complete? */
+ if (emac->is_sr1 && *(swdata) == emac->cmd_data) {
+ prueth_xmit_free(tx_chn, desc_tx);
+ continue;
+ }
+
skb = *(swdata);
prueth_xmit_free(tx_chn, desc_tx);
@@ -327,6 +333,7 @@ int prueth_init_rx_chns(struct prueth_emac *emac,
struct net_device *ndev = emac->ndev;
u32 fdqring_id, hdesc_size;
int i, ret = 0, slice;
+ int flow_id_base;
slice = prueth_emac_slice(emac);
if (slice < 0)
@@ -367,8 +374,14 @@ int prueth_init_rx_chns(struct prueth_emac *emac,
goto fail;
}
- emac->rx_flow_id_base = k3_udma_glue_rx_get_flow_id_base(rx_chn->rx_chn);
- netdev_dbg(ndev, "flow id base = %d\n", emac->rx_flow_id_base);
+ flow_id_base = k3_udma_glue_rx_get_flow_id_base(rx_chn->rx_chn);
+ if (emac->is_sr1 && !strcmp(name, "rxmgm")) {
+ emac->rx_mgm_flow_id_base = flow_id_base;
+ netdev_dbg(ndev, "mgm flow id base = %d\n", flow_id_base);
+ } else {
+ emac->rx_flow_id_base = flow_id_base;
+ netdev_dbg(ndev, "flow id base = %d\n", flow_id_base);
+ }
fdqring_id = K3_RINGACC_RING_ID_ANY;
for (i = 0; i < rx_cfg.flow_id_num; i++) {
@@ -477,10 +490,14 @@ void emac_rx_timestamp(struct prueth_emac *emac,
struct skb_shared_hwtstamps *ssh;
u64 ns;
- u32 hi_sw = readl(emac->prueth->shram.va +
- TIMESYNC_FW_WC_COUNT_HI_SW_OFFSET_OFFSET);
- ns = icssg_ts_to_ns(hi_sw, psdata[1], psdata[0],
- IEP_DEFAULT_CYCLE_TIME_NS);
+ if (emac->is_sr1) {
+ ns = (u64)psdata[1] << 32 | psdata[0];
+ } else {
+ u32 hi_sw = readl(emac->prueth->shram.va +
+ TIMESYNC_FW_WC_COUNT_HI_SW_OFFSET_OFFSET);
+ ns = icssg_ts_to_ns(hi_sw, psdata[1], psdata[0],
+ IEP_DEFAULT_CYCLE_TIME_NS);
+ }
ssh = skb_hwtstamps(skb);
memset(ssh, 0, sizeof(*ssh));
@@ -809,7 +826,8 @@ void prueth_emac_stop(struct prueth_emac *emac)
}
emac->fw_running = 0;
- rproc_shutdown(prueth->txpru[slice]);
+ if (!emac->is_sr1)
+ rproc_shutdown(prueth->txpru[slice]);
rproc_shutdown(prueth->rtu[slice]);
rproc_shutdown(prueth->pru[slice]);
}
@@ -829,8 +847,10 @@ void prueth_cleanup_tx_ts(struct prueth_emac *emac)
int emac_napi_rx_poll(struct napi_struct *napi_rx, int budget)
{
struct prueth_emac *emac = prueth_napi_to_emac(napi_rx);
- int rx_flow = PRUETH_RX_FLOW_DATA;
- int flow = PRUETH_MAX_RX_FLOWS;
+ int rx_flow = emac->is_sr1 ?
+ PRUETH_RX_FLOW_DATA_SR1 : PRUETH_RX_FLOW_DATA;
+ int flow = emac->is_sr1 ?
+ PRUETH_MAX_RX_FLOWS_SR1 : PRUETH_MAX_RX_FLOWS;
int num_rx = 0;
int cur_budget;
int ret;
@@ -1082,7 +1102,7 @@ void prueth_netdev_exit(struct prueth *prueth,
prueth->emac[mac] = NULL;
}
-int prueth_get_cores(struct prueth *prueth, int slice)
+int prueth_get_cores(struct prueth *prueth, int slice, bool is_sr1)
{
struct device *dev = prueth->dev;
enum pruss_pru_id pruss_id;
@@ -1096,7 +1116,7 @@ int prueth_get_cores(struct prueth *prueth, int slice)
idx = 0;
break;
case ICSS_SLICE1:
- idx = 3;
+ idx = is_sr1 ? 2 : 3;
break;
default:
return -EINVAL;
@@ -1118,6 +1138,9 @@ int prueth_get_cores(struct prueth *prueth, int slice)
return dev_err_probe(dev, ret, "unable to get RTU%d\n", slice);
}
+ if (is_sr1)
+ return 0;
+
idx++;
prueth->txpru[slice] = pru_rproc_get(np, idx, NULL);
if (IS_ERR(prueth->txpru[slice])) {
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.c b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
index 7d9db9683e18..186b0365c2e5 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.c
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.c
@@ -907,13 +907,13 @@ static int prueth_probe(struct platform_device *pdev)
}
if (eth0_node) {
- ret = prueth_get_cores(prueth, ICSS_SLICE0);
+ ret = prueth_get_cores(prueth, ICSS_SLICE0, false);
if (ret)
goto put_cores;
}
if (eth1_node) {
- ret = prueth_get_cores(prueth, ICSS_SLICE1);
+ ret = prueth_get_cores(prueth, ICSS_SLICE1, false);
if (ret)
goto put_cores;
}
diff --git a/drivers/net/ethernet/ti/icssg/icssg_prueth.h b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
index 5441f2c26430..82e38ef5635b 100644
--- a/drivers/net/ethernet/ti/icssg/icssg_prueth.h
+++ b/drivers/net/ethernet/ti/icssg/icssg_prueth.h
@@ -354,7 +354,7 @@ int prueth_node_port(struct device_node *eth_node);
int prueth_node_mac(struct device_node *eth_node);
void prueth_netdev_exit(struct prueth *prueth,
struct device_node *eth_node);
-int prueth_get_cores(struct prueth *prueth, int slice);
+int prueth_get_cores(struct prueth *prueth, int slice, bool is_sr1);
void prueth_put_cores(struct prueth *prueth, int slice);
/* Revision specific helper */
--
2.44.0
^ permalink raw reply related
* Re: [PATCH v5 02/10] dt-bindings: mailbox: Add mboxes property for CMDQ secure driver
From: Rob Herring @ 2024-04-03 11:43 UTC (permalink / raw)
To: Shawn Sung
Cc: CK Hu, Krzysztof Kozlowski, linux-kernel, linux-arm-kernel,
Conor Dooley, Jason-JH . Lin, AngeloGioacchino Del Regno,
Houlong Wei, Jassi Brar, devicetree, linux-mediatek,
Matthias Brugger
In-Reply-To: <20240403102602.32155-3-shawn.sung@mediatek.com>
On Wed, 03 Apr 2024 18:25:54 +0800, Shawn Sung wrote:
> From: "Jason-JH.Lin" <jason-jh.lin@mediatek.com>
>
> Add mboxes to define a GCE looping thread as a secure irq handler.
> This property is only required if CMDQ secure driver is supported.
>
> Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
> Signed-off-by: Hsiao Chien Sung <shawn.sung@mediatek.com>
> ---
> .../bindings/mailbox/mediatek,gce-mailbox.yaml | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):
yamllint warnings/errors:
dtschema/dtc warnings/errors:
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml:
Unresolvable JSON pointer: 'definitions/uint32-arrayi'
doc reference errors (make refcheckdocs):
See https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20240403102602.32155-3-shawn.sung@mediatek.com
The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.
If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:
pip3 install dtschema --upgrade
Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.
^ permalink raw reply
* Re: [PATCH] arm64: tlb: Fix TLBI RANGE operand
From: Gavin Shan @ 2024-04-03 11:37 UTC (permalink / raw)
To: Marc Zyngier
Cc: linux-arm-kernel, linux-kernel, catalin.marinas, will, akpm,
apopple, mark.rutland, ryan.roberts, rananta, yangyicong,
v-songbaohua, yezhenyu2, yihyu, shan.gavin
In-Reply-To: <86edbmu8kn.wl-maz@kernel.org>
On 4/3/24 18:58, Marc Zyngier wrote:
> On Wed, 03 Apr 2024 07:49:29 +0100,
> Gavin Shan <gshan@redhat.com> wrote:
>>
>> KVM/arm64 relies on TLBI RANGE feature to flush TLBs when the dirty
>> bitmap is collected by VMM and the corresponding PTEs need to be
>> write-protected again. Unfortunately, the operand passed to the TLBI
>> RANGE instruction isn't correctly sorted out by commit d1d3aa98b1d4
>> ("arm64: tlb: Use the TLBI RANGE feature in arm64"). It leads to
>> crash on the destination VM after live migration because some of the
>> dirty pages are missed.
>>
>> For example, I have a VM where 8GB memory is assigned, starting from
>> 0x40000000 (1GB). Note that the host has 4KB as the base page size.
>> All TLBs for VM can be covered by one TLBI RANGE operation. However,
>> I receive 0xffff708000040000 as the operand, which is wrong; the
>> correct one should be 0x00007f8000040000. From the wrong operand, we
>> have 3 and 1 for SCALE (bits[45:44]) and NUM (bits[43:39]), so only 1GB
>> instead of 8GB memory is covered.
>>
>> Fix the macro __TLBI_RANGE_NUM() so that the correct NUM and TLBI
>> RANGE operand are provided.
>>
>> Fixes: d1d3aa98b1d4 ("arm64: tlb: Use the TLBI RANGE feature in arm64")
>> Cc: stable@kernel.org # v5.10+
>> Reported-by: Yihuang Yu <yihyu@redhat.com>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>> arch/arm64/include/asm/tlbflush.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
>> index 3b0e8248e1a4..07c4fb4b82b4 100644
>> --- a/arch/arm64/include/asm/tlbflush.h
>> +++ b/arch/arm64/include/asm/tlbflush.h
>> @@ -166,7 +166,7 @@ static inline unsigned long get_trans_granule(void)
>> */
>> #define TLBI_RANGE_MASK GENMASK_ULL(4, 0)
>> #define __TLBI_RANGE_NUM(pages, scale) \
>> - ((((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK) - 1)
>> + ((((pages) >> (5 * (scale) + 1)) - 1) & TLBI_RANGE_MASK)
>>
>> /*
>> * TLB Invalidation
>
> This looks pretty wrong, by the very definition of the comment that's
> just above:
>
> <quote>
> /*
> * Generate 'num' values from -1 to 30 with -1 rejected by the
> * __flush_tlb_range() loop below.
> */
> </quote>
>
> With your change, num can't ever be negative, and that breaks
> __flush_tlb_range_op():
>
> <quote>
> num = __TLBI_RANGE_NUM(pages, scale); \
> if (num >= 0) { \
> addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
> scale, num, tlb_level); \
> __tlbi(r##op, addr); \
> if (tlbi_user) \
> __tlbi_user(r##op, addr); \
> start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> pages -= __TLBI_RANGE_PAGES(num, scale); \
> } \
> scale--; \
> </quote>
>
> We'll then shove whatever value we've found in the TLBI operation,
> leading to unknown results instead of properly adjusting the scale to
> issue a smaller invalidation.
>
Marc, thanks for your review and comments.
Indeed, this patch is at least incomplete. I think we need __TLBI_RANGE_NUM()
to return [-1 31] instead of [-1 30], to be consistent with MAX_TLBI_RANGE_PAGES.
-1 will be rejected in the following loop. I'm not 100% sure my calculation is
correct, though.
/*
* Generate 'num' values in range [-1 31], but -1 will be rejected
* by the __flush_tlb_range() loop below.
*/
#define __TLBI_RANGE_NUM(pages, scale) \
({ \
int __next = (pages) & (1ULL << (5 * (scale) + 6)); \
int __mask = ((pages) >> (5 * (scale) + 1)) & TLBI_RANGE_MASK; \
int __num = (((pages) >> (5 * (scale) + 1)) - 1) & \
TLBI_RANGE_MASK; \
(__next || __mask) ? __num : -1; \
})
Alternatively, we can also limit the number of pages to be invalidated from
arch/arm64/kvm/hyp/pgtable.c::kvm_tlb_flush_vmid_range() because the maximal
capacity is (MAX_TLBI_RANGE_PAGES - 1) instead of MAX_TLBI_RANGE_PAGES, as
the comments for __flush_tlb_range_nosync() say.
- inval_pages = min(pages, MAX_TLBI_RANGE_PAGES);
+ inval_pages = min(pages, MAX_TLBI_RANGE_PAGES - 1);
static inline void __flush_tlb_range_nosync(...)
{
:
/*
* When not uses TLB range ops, we can handle up to
* (MAX_DVM_OPS - 1) pages;
* When uses TLB range ops, we can handle up to
* (MAX_TLBI_RANGE_PAGES - 1) pages.
*/
if ((!system_supports_tlb_range() &&
(end - start) >= (MAX_DVM_OPS * stride)) ||
pages >= MAX_TLBI_RANGE_PAGES) {
flush_tlb_mm(vma->vm_mm);
return;
}
}
Please let me know which way is better.
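As a rough user-space sketch of the arithmetic discussed above (not the kernel code itself), the functions below mirror the original macro, the macro proposed in this mail, and __TLBI_RANGE_PAGES() as used in the quoted loop. The function names are hypothetical, and the assert() on scale is an added guard that keeps the shift count non-negative.

```c
#include <assert.h>
#include <stdint.h>

#define TLBI_RANGE_MASK	0x1fULL	/* GENMASK_ULL(4, 0) */

/* Original macro body: values in [-1, 30]. */
static inline int range_num_orig(uint64_t pages, int scale)
{
	assert(scale >= 0 && scale <= 3);
	return (int)((pages >> (5 * scale + 1)) & TLBI_RANGE_MASK) - 1;
}

/* Macro proposed in this mail: values in [-1, 31]. */
static inline int range_num_proposed(uint64_t pages, int scale)
{
	uint64_t shifted;
	int next, mask, num;

	assert(scale >= 0 && scale <= 3);
	shifted = pages >> (5 * scale + 1);
	next = (pages & (1ULL << (5 * scale + 6))) != 0;
	mask = (shifted & TLBI_RANGE_MASK) != 0;
	num = (int)((shifted - 1) & TLBI_RANGE_MASK);

	return (next || mask) ? num : -1;
}

/* Pages covered by one range operation: (num + 1) << (5 * scale + 1). */
static inline uint64_t range_pages(int num, int scale)
{
	return (uint64_t)(num + 1) << (5 * scale + 1);
}
```

For pages = 0x200000 and scale = 3, the original form yields -1 while the proposed form yields 31, and range_pages(31, 3) is 0x200000 pages, which is the 8GB case from the commit message with 4KB pages.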
> I think the problem is that you are triggering NUM=31 and SCALE=3,
> which the current code cannot handle as per the comment above
> __flush_tlb_range_op() (we can't do NUM=30 and SCALE=4, obviously).
>
Yes, exactly.
> Can you try the untested patch below?
>
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 3b0e8248e1a4..b71a1cece802 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -379,10 +379,6 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
> * 3. If there is 1 page remaining, flush it through non-range operations. Range
> * operations can only span an even number of pages. We save this for last to
> * ensure 64KB start alignment is maintained for the LPA2 case.
> - *
> - * Note that certain ranges can be represented by either num = 31 and
> - * scale or num = 0 and scale + 1. The loop below favours the latter
> - * since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
> */
> #define __flush_tlb_range_op(op, start, pages, stride, \
> asid, tlb_level, tlbi_user, lpa2) \
> @@ -407,6 +403,7 @@ do { \
> \
> num = __TLBI_RANGE_NUM(pages, scale); \
> if (num >= 0) { \
> + num += 1; \
> addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
> scale, num, tlb_level); \
> __tlbi(r##op, addr); \
>
Thanks, but I don't think it's going to work. The loop will run indefinitely:
when @pages is 0x200000, the condition 'if (num >= 0)' can't be met for @scale
3/2/1/0. It is only met after @scale has gone negative and become positive
again, at which point @scale is no longer in the range [0 3]. I ported this
chunk of code to user space and can see this with added printf() messages.
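A simplified approximation of that user-space check (not the actual ported code) could look like the sketch below. It only scans @scale from 3 down to 0 with the unmodified macro and reports the first scale for which 'num >= 0' holds. The function names are hypothetical, and the scan deliberately stops at scale 0 rather than letting scale go negative.

```c
#include <stdint.h>

#define TLBI_RANGE_MASK	0x1fULL	/* GENMASK_ULL(4, 0) */

/* Unmodified __TLBI_RANGE_NUM() arithmetic, valid for scale in [0, 3]. */
static inline int range_num(uint64_t pages, int scale)
{
	return (int)((pages >> (5 * scale + 1)) & TLBI_RANGE_MASK) - 1;
}

/*
 * Return the first scale in 3..0 for which 'num >= 0' holds,
 * or -1 if no scale in that range qualifies.
 */
static inline int first_usable_scale(uint64_t pages)
{
	int scale;

	for (scale = 3; scale >= 0; scale--) {
		if (range_num(pages, scale) >= 0)
			return scale;
	}

	return -1;
}
```

With pages = 0x200000 no scale in [0 3] qualifies, so the 'num += 1' branch is never reached and @pages is never reduced. A value such as 0x100000 is accepted at scale 3.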
Thanks,
Gavin
^ permalink raw reply
* [PATCH v5 9/9] drm/mediatek: Add cmdq_insert_backup_cookie before secure pkt finalize
From: Shawn Sung @ 2024-04-03 10:27 UTC (permalink / raw)
To: Chun-Kuang Hu
Cc: Philipp Zabel, David Airlie, Daniel Vetter, Matthias Brugger,
AngeloGioacchino Del Regno, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Sumit Semwal, Christian König, dri-devel,
linux-mediatek, linux-kernel, linux-arm-kernel, linux-media,
linaro-mm-sig, Jason-JH.Lin, Hsiao Chien Sung
In-Reply-To: <20240403102701.369-1-shawn.sung@mediatek.com>
From: "Jason-JH.Lin" <jason-jh.lin@mediatek.com>
Add cmdq_insert_backup_cookie to append some commands before EOC:
1. Get GCE HW thread execute count from the GCE HW register.
2. Add 1 to the execute count and then store into a shared memory.
3. Set a software event signal as secure irq to GCE HW.
Since the value of execute count + 1 is stored in a shared memory,
CMDQ driver in the normal world can use it to handle task done in irq
handler and CMDQ driver in the secure world will use it to schedule
the task slot for each secure thread.
Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
Signed-off-by: Hsiao Chien Sung <shawn.sung@mediatek.com>
---
drivers/gpu/drm/mediatek/mtk_crtc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_crtc.c b/drivers/gpu/drm/mediatek/mtk_crtc.c
index 8a3c204d48d2b..8a70d731f2ee2 100644
--- a/drivers/gpu/drm/mediatek/mtk_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_crtc.c
@@ -186,7 +186,7 @@ void mtk_crtc_disable_secure_state(struct drm_crtc *crtc)
sec_scn = CMDQ_SEC_SCNR_SUB_DISP_DISABLE;
cmdq_sec_pkt_set_data(&mtk_crtc->sec_cmdq_handle, sec_engine, sec_engine, sec_scn);
-
+ cmdq_sec_insert_backup_cookie(&mtk_crtc->sec_cmdq_handle);
cmdq_pkt_finalize(&mtk_crtc->sec_cmdq_handle);
dma_sync_single_for_device(mtk_crtc->sec_cmdq_client.chan->mbox->dev,
mtk_crtc->sec_cmdq_handle.pa_base,
@@ -812,6 +812,8 @@ static void mtk_crtc_update_config(struct mtk_crtc *mtk_crtc, bool needs_vblank)
cmdq_pkt_clear_event(cmdq_handle, mtk_crtc->cmdq_event);
cmdq_pkt_wfe(cmdq_handle, mtk_crtc->cmdq_event, false);
mtk_crtc_ddp_config(crtc, cmdq_handle);
+ if (cmdq_handle->sec_data)
+ cmdq_sec_insert_backup_cookie(cmdq_handle);
cmdq_pkt_finalize(cmdq_handle);
dma_sync_single_for_device(cmdq_client.chan->mbox->dev,
cmdq_handle->pa_base,
--
2.18.0
^ permalink raw reply related
* [PATCH v5 02/10] dt-bindings: mailbox: Add mboxes property for CMDQ secure driver
From: Shawn Sung @ 2024-04-03 10:25 UTC (permalink / raw)
To: CK Hu, Jassi Brar, AngeloGioacchino Del Regno
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Matthias Brugger,
Hsiao Chien Sung, Jason-JH . Lin, Houlong Wei, linux-kernel,
devicetree, linux-arm-kernel, linux-mediatek
In-Reply-To: <20240403102602.32155-1-shawn.sung@mediatek.com>
From: "Jason-JH.Lin" <jason-jh.lin@mediatek.com>
Add mboxes to define a GCE looping thread as a secure irq handler.
This property is only required if CMDQ secure driver is supported.
Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
Signed-off-by: Hsiao Chien Sung <shawn.sung@mediatek.com>
---
.../bindings/mailbox/mediatek,gce-mailbox.yaml | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml b/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml
index cef9d76013985..c0d80cc770899 100644
--- a/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml
+++ b/Documentation/devicetree/bindings/mailbox/mediatek,gce-mailbox.yaml
@@ -49,6 +49,16 @@ properties:
items:
- const: gce
+ mediatek,gce-events:
+ description:
+ The event id which is mapping to the specific hardware event signal
+ to gce. The event id is defined in the gce header
+ include/dt-bindings/gce/<chip>-gce.h of each chips.
+ $ref: /schemas/types.yaml#/definitions/uint32-arrayi
+
+ mboxes:
+ maxItems: 1
+
required:
- compatible
- "#mbox-cells"
--
2.18.0
^ permalink raw reply related
* [PATCH v5 2/9] drm/mediatek: Add secure buffer control flow to mtk_drm_gem
From: Shawn Sung @ 2024-04-03 10:26 UTC (permalink / raw)
To: Chun-Kuang Hu
Cc: Philipp Zabel, David Airlie, Daniel Vetter, Matthias Brugger,
AngeloGioacchino Del Regno, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, Sumit Semwal, Christian König, dri-devel,
linux-mediatek, linux-kernel, linux-arm-kernel, linux-media,
linaro-mm-sig, Jason-JH.Lin, Hsiao Chien Sung
In-Reply-To: <20240403102701.369-1-shawn.sung@mediatek.com>
From: "Jason-JH.Lin" <jason-jh.lin@mediatek.com>
Add secure buffer control flow to mtk_drm_gem.
When user space passes the DRM_MTK_GEM_CREATE_ENCRYPTED flag and a size
to create a mtk_drm_gem object, mtk_drm_gem allocates a dma buffer of that
size from the secure dma-heap and binds it to the mtk_drm_gem object.
Signed-off-by: Jason-JH.Lin <jason-jh.lin@mediatek.com>
Signed-off-by: Hsiao Chien Sung <shawn.sung@mediatek.com>
---
drivers/gpu/drm/mediatek/mtk_gem.c | 85 +++++++++++++++++++++++++++++-
drivers/gpu/drm/mediatek/mtk_gem.h | 4 ++
2 files changed, 88 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_gem.c b/drivers/gpu/drm/mediatek/mtk_gem.c
index e59e0727717b7..ec34d02c14377 100644
--- a/drivers/gpu/drm/mediatek/mtk_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_gem.c
@@ -4,6 +4,8 @@
*/
#include <linux/dma-buf.h>
+#include <linux/dma-heap.h>
+#include <uapi/linux/dma-heap.h>
#include <drm/mediatek_drm.h>
#include <drm/drm.h>
@@ -102,6 +104,81 @@ struct mtk_gem_obj *mtk_gem_create(struct drm_device *dev,
return ERR_PTR(ret);
}
+struct mtk_gem_obj *mtk_gem_create_from_heap(struct drm_device *dev,
+ const char *heap, size_t size)
+{
+ struct mtk_drm_private *priv = dev->dev_private;
+ struct mtk_gem_obj *mtk_gem;
+ struct drm_gem_object *obj;
+ struct dma_heap *dma_heap;
+ struct dma_buf *dma_buf;
+ struct dma_buf_attachment *attach;
+ struct sg_table *sgt;
+ struct iosys_map map = {};
+ int ret;
+
+ mtk_gem = mtk_gem_init(dev, size);
+ if (IS_ERR(mtk_gem))
+ return ERR_CAST(mtk_gem);
+
+ obj = &mtk_gem->base;
+
+ dma_heap = dma_heap_find(heap);
+ if (!dma_heap) {
+ DRM_ERROR("heap find fail\n");
+ goto err_gem_free;
+ }
+ dma_buf = dma_heap_buffer_alloc(dma_heap, size,
+ O_RDWR | O_CLOEXEC, DMA_HEAP_VALID_HEAP_FLAGS);
+ if (IS_ERR(dma_buf)) {
+ DRM_ERROR("buffer alloc fail\n");
+ dma_heap_put(dma_heap);
+ goto err_gem_free;
+ }
+ dma_heap_put(dma_heap);
+
+ attach = dma_buf_attach(dma_buf, priv->dma_dev);
+ if (IS_ERR(attach)) {
+ DRM_ERROR("attach fail, return\n");
+ dma_buf_put(dma_buf);
+ goto err_gem_free;
+ }
+
+ sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
+ if (IS_ERR(sgt)) {
+ DRM_ERROR("map failed, detach and return\n");
+ dma_buf_detach(dma_buf, attach);
+ dma_buf_put(dma_buf);
+ goto err_gem_free;
+ }
+ obj->import_attach = attach;
+ mtk_gem->dma_addr = sg_dma_address(sgt->sgl);
+ mtk_gem->sg = sgt;
+ mtk_gem->size = dma_buf->size;
+
+ if (!strcmp(heap, "mtk_svp") || !strcmp(heap, "mtk_svp_cma")) {
+ /* secure buffer can not be mapped */
+ mtk_gem->secure = true;
+ } else {
+ ret = dma_buf_vmap(dma_buf, &map);
+ if (ret) {
+ DRM_ERROR("map failed, ret=%d\n", ret);
+ dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
+ dma_buf_detach(dma_buf, attach);
+ dma_buf_put(dma_buf);
+ goto err_gem_free;
+ }
+ mtk_gem->kvaddr = map.vaddr;
+ }
+
+ return mtk_gem;
+
+err_gem_free:
+ drm_gem_object_release(obj);
+ kfree(mtk_gem);
+ return ERR_PTR(-ENOMEM);
+}
+
void mtk_gem_free_object(struct drm_gem_object *obj)
{
struct mtk_gem_obj *mtk_gem = to_mtk_gem_obj(obj);
@@ -229,7 +306,9 @@ struct drm_gem_object *mtk_gem_prime_import_sg_table(struct drm_device *dev,
if (IS_ERR(mtk_gem))
return ERR_CAST(mtk_gem);
+ mtk_gem->secure = !sg_page(sg->sgl);
mtk_gem->dma_addr = sg_dma_address(sg->sgl);
+ mtk_gem->size = attach->dmabuf->size;
mtk_gem->sg = sg;
return &mtk_gem->base;
@@ -304,7 +383,11 @@ int mtk_gem_create_ioctl(struct drm_device *dev, void *data,
struct drm_mtk_gem_create *args = data;
int ret;
- mtk_gem = mtk_gem_create(dev, args->size, false);
+ if (args->flags & DRM_MTK_GEM_CREATE_ENCRYPTED)
+ mtk_gem = mtk_gem_create_from_heap(dev, "mtk_svp_cma", args->size);
+ else
+ mtk_gem = mtk_gem_create(dev, args->size, false);
+
if (IS_ERR(mtk_gem))
return PTR_ERR(mtk_gem);
diff --git a/drivers/gpu/drm/mediatek/mtk_gem.h b/drivers/gpu/drm/mediatek/mtk_gem.h
index 4d7598220ca8f..75cf50495abe0 100644
--- a/drivers/gpu/drm/mediatek/mtk_gem.h
+++ b/drivers/gpu/drm/mediatek/mtk_gem.h
@@ -27,9 +27,11 @@ struct mtk_gem_obj {
void *cookie;
void *kvaddr;
dma_addr_t dma_addr;
+ size_t size;
unsigned long dma_attrs;
struct sg_table *sg;
struct page **pages;
+ bool secure;
};
#define to_mtk_gem_obj(x) container_of(x, struct mtk_gem_obj, base)
@@ -37,6 +39,8 @@ struct mtk_gem_obj {
void mtk_gem_free_object(struct drm_gem_object *gem);
struct mtk_gem_obj *mtk_gem_create(struct drm_device *dev, size_t size,
bool alloc_kmap);
+struct mtk_gem_obj *mtk_gem_create_from_heap(struct drm_device *dev,
+ const char *heap, size_t size);
int mtk_gem_dumb_create(struct drm_file *file_priv, struct drm_device *dev,
struct drm_mode_create_dumb *args);
struct sg_table *mtk_gem_prime_get_sg_table(struct drm_gem_object *obj);
--
2.18.0