* [PATCH 0/2] Improve VCHIQ cache line size handling
@ 2018-09-12 15:06 Phil Elwell
[not found] ` <1536764809-132672-1-git-send-email-phil-FnsA7b+Nu9XbIbC87yuRow@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Phil Elwell @ 2018-09-12 15:06 UTC (permalink / raw)
To: Rob Herring, Stefan Wahren, Greg Kroah-Hartman, Phil Elwell,
devicetree-u79uwXL29TY76Z2rM5mHXA,
linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
Both sides of the VCHIQ communications mechanism need to agree on the cache
line size. Using an incorrect value can lead to data corruption, but having the
two sides using different values is usually worse.
In the absence of an obvious convenient run-time method to determine the
correct value in the ARCH=arm world, the downstream Raspberry Pi trees used a
Device Tree property, written by the firmware, to configure the kernel driver.
This method was vetoed during the upstreaming process, so a fixed value of 32
was used instead, and some corruptions ensued. This is take 2 at arriving at
the correct value.
Part one of the fix is deriving the correct value from the ARM's cpuid register.
Part two is a (seemingly cosmetic) correction of the Device Tree reg declaration
used by the driver, but it doubles as an indication to the Raspberry Pi firmware
that the kernel driver is running a recent kernel driver that chooses the
correct value. As such I would like very much for the DT patch not to be merged
before the driver patch - just tell me what hoops I need to jump through.
Phil Elwell (2):
staging/vc04_services: Derive g_cache_line_size
ARM: dts: bcm283x: Correct mailbox register sizes
arch/arm/boot/dts/bcm2835-rpi.dtsi | 2 +-
.../interface/vchiq_arm/vchiq_2835_arm.c | 24 +++++++++++++++++-----
2 files changed, 20 insertions(+), 6 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 0/2] Improve VCHIQ cache line size handling
@ 2018-09-12 16:42 Phil Elwell
2018-09-12 16:42 ` [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size Phil Elwell
0 siblings, 1 reply; 11+ messages in thread
From: Phil Elwell @ 2018-09-12 16:42 UTC (permalink / raw)
To: Rob Herring, Stefan Wahren, Greg Kroah-Hartman, Phil Elwell,
devicetree, linux-rpi-kernel, Russell King, Arnd Bergmann,
linux-arm-kernel, bcm-kernel-feedback-list
Both sides of the VCHIQ communications mechanism need to agree on the cache
line size. Using an incorrect value can lead to data corruption, but having the
two sides using different values is usually worse.
In the absence of an obvious convenient run-time method to determine the
correct value in the ARCH=arm world, the downstream Raspberry Pi trees used a
Device Tree property, written by the firmware, to configure the kernel driver.
This method was vetoed during the upstreaming process, so a fixed value of 32
was used instead, and some corruptions ensued. This is take 2 at arriving at
the correct value.
Part one of the fix is deriving the correct value from the ARM's cpuid register.
Part two is a (seemingly cosmetic) correction of the Device Tree reg declaration
used by the driver, but it doubles as an indication to the Raspberry Pi firmware
that the kernel driver is running a recent kernel driver that chooses the
correct value. As such I would like very much for the DT patch not to be merged
before the driver patch - just tell me what hoops I need to jump through.
Phil Elwell (2):
staging/vc04_services: Derive g_cache_line_size
ARM: dts: bcm283x: Correct mailbox register sizes
arch/arm/boot/dts/bcm2835-rpi.dtsi | 2 +-
.../interface/vchiq_arm/vchiq_2835_arm.c | 24 +++++++++++++++++-----
2 files changed, 20 insertions(+), 6 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size
2018-09-12 16:42 Phil Elwell
@ 2018-09-12 16:42 ` Phil Elwell
2018-09-14 10:11 ` Stefan Wahren
0 siblings, 1 reply; 11+ messages in thread
From: Phil Elwell @ 2018-09-12 16:42 UTC (permalink / raw)
To: Rob Herring, Stefan Wahren, Greg Kroah-Hartman, Phil Elwell,
devicetree, linux-rpi-kernel, Russell King, Arnd Bergmann,
linux-arm-kernel, bcm-kernel-feedback-list
The ARM coprocessor registers include dcache line size, but there is no
function to expose this value. Rather than create a new one, use the
read_cpuid_id function to derive the correct value, which is 32 for
BCM2835 and 64 for BCM2836/7.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
---
.../interface/vchiq_arm/vchiq_2835_arm.c | 24 +++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
index e767209..3537f60 100644
--- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
+++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
@@ -42,6 +42,7 @@
#include <linux/uaccess.h>
#include <linux/mm.h>
#include <linux/of.h>
+#include <asm/cputype.h>
#include <soc/bcm2835/raspberrypi-firmware.h>
#define TOTAL_SLOTS (VCHIQ_SLOT_ZERO_SLOTS + 2 * 32)
@@ -81,13 +82,15 @@ static void __iomem *g_regs;
* VPU firmware, which determines the required alignment of the
* offsets/sizes in pagelists.
*
- * Modern VPU firmware looks for a DT "cache-line-size" property in
- * the VCHIQ node and will overwrite it with the actual L2 cache size,
+ * Previous VPU firmware looked for a DT "cache-line-size" property in
+ * the VCHIQ node and would overwrite it with the actual L2 cache size,
* which the kernel must then respect. That property was rejected
- * upstream, so we have to use the VPU firmware's compatibility value
- * of 32.
+ * upstream, so we now rely on both sides to "do the right thing" independently
+ * of the other. To improve backwards compatibility, this new behaviour is
+ * signalled to the firmware by the use of a corrected "reg" property on the
+ * relevant Device Tree node.
*/
-static unsigned int g_cache_line_size = 32;
+static unsigned int g_cache_line_size;
static unsigned int g_fragments_size;
static char *g_fragments_base;
static char *g_free_fragments;
@@ -127,6 +130,17 @@ int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
if (err < 0)
return err;
+ /*
+ * The tempting L1_CACHE_BYTES macro doesn't work in the case of
+ * a kernel built with bcm2835_defconfig running on a BCM2836/7
+ * processor, hence the need for a runtime check. The dcache line size
+ * is encoded in one of the coprocessor registers, but there is no
+ * convenient way to access it short of embedded assembler, hence
+ * the use of read_cpuid_id(). The following test evaluates to true
+ * on a BCM2835 showing that it is ARMv6-ish, whereas
+ * cpu_architecture() will indicate that it is an ARMv7.
+ */
+ g_cache_line_size = ((read_cpuid_id() & 0x7f000) == 0x7b000) ? 32 : 64;
g_fragments_size = 2 * g_cache_line_size;
/* Allocate space for the channels in coherent memory */
--
2.7.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size
2018-09-12 16:42 ` [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size Phil Elwell
@ 2018-09-14 10:11 ` Stefan Wahren
2018-09-14 10:26 ` Phil Elwell
[not found] ` <107b707e-1f8c-1305-2582-0a131011758d-eS4NqCHxEME@public.gmane.org>
0 siblings, 2 replies; 11+ messages in thread
From: Stefan Wahren @ 2018-09-14 10:11 UTC (permalink / raw)
To: Phil Elwell, Rob Herring, Greg Kroah-Hartman, devicetree,
linux-rpi-kernel, Russell King, Arnd Bergmann, linux-arm-kernel,
bcm-kernel-feedback-list
Hi Phil,
Am 12.09.2018 um 18:42 schrieb Phil Elwell:
> The ARM coprocessor registers include dcache line size, but there is no
> function to expose this value. Rather than create a new one, use the
> read_cpuid_id function to derive the correct value, which is 32 for
> BCM2835 and 64 for BCM2836/7.
>
> Signed-off-by: Phil Elwell <phil@raspberrypi.org>
> ---
> .../interface/vchiq_arm/vchiq_2835_arm.c | 24 +++++++++++++++++-----
> 1 file changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
> index e767209..3537f60 100644
> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
> @@ -42,6 +42,7 @@
> #include <linux/uaccess.h>
> #include <linux/mm.h>
> #include <linux/of.h>
> +#include <asm/cputype.h>
> #include <soc/bcm2835/raspberrypi-firmware.h>
>
> #define TOTAL_SLOTS (VCHIQ_SLOT_ZERO_SLOTS + 2 * 32)
> @@ -81,13 +82,15 @@ static void __iomem *g_regs;
> * VPU firmware, which determines the required alignment of the
> * offsets/sizes in pagelists.
> *
> - * Modern VPU firmware looks for a DT "cache-line-size" property in
> - * the VCHIQ node and will overwrite it with the actual L2 cache size,
> + * Previous VPU firmware looked for a DT "cache-line-size" property in
> + * the VCHIQ node and would overwrite it with the actual L2 cache size,
> * which the kernel must then respect. That property was rejected
> - * upstream, so we have to use the VPU firmware's compatibility value
> - * of 32.
> + * upstream, so we now rely on both sides to "do the right thing" independently
> + * of the other. To improve backwards compatibility, this new behaviour is
> + * signalled to the firmware by the use of a corrected "reg" property on the
> + * relevant Device Tree node.
> */
> -static unsigned int g_cache_line_size = 32;
> +static unsigned int g_cache_line_size;
> static unsigned int g_fragments_size;
> static char *g_fragments_base;
> static char *g_free_fragments;
> @@ -127,6 +130,17 @@ int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
> if (err < 0)
> return err;
>
> + /*
> + * The tempting L1_CACHE_BYTES macro doesn't work in the case of
> + * a kernel built with bcm2835_defconfig running on a BCM2836/7
> + * processor, hence the need for a runtime check. The dcache line size
> + * is encoded in one of the coprocessor registers, but there is no
> + * convenient way to access it short of embedded assembler, hence
> + * the use of read_cpuid_id(). The following test evaluates to true
> + * on a BCM2835 showing that it is ARMv6-ish, whereas
> + * cpu_architecture() will indicate that it is an ARMv7.
> + */
> + g_cache_line_size = ((read_cpuid_id() & 0x7f000) == 0x7b000) ? 32 : 64;
is there a chance to use ARM_CPU_PART_ARM1176 or something else instead
of this magic numbers?
Sorry, i cannot remember the whole discussion but why we can use
different compatible here instead of using ARM specific stuff.
compatible = "brcm,bcm2836-vchiq", "brcm,bcm2835-vchiq"; // for BCM2836/7
compatible = "brcm,bcm2835"; // only for BCM2835
> g_fragments_size = 2 * g_cache_line_size;
>
> /* Allocate space for the channels in coherent memory */
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size
2018-09-14 10:11 ` Stefan Wahren
@ 2018-09-14 10:26 ` Phil Elwell
2018-09-14 11:03 ` Stefan Wahren
[not found] ` <107b707e-1f8c-1305-2582-0a131011758d-eS4NqCHxEME@public.gmane.org>
1 sibling, 1 reply; 11+ messages in thread
From: Phil Elwell @ 2018-09-14 10:26 UTC (permalink / raw)
To: Stefan Wahren, Rob Herring, Greg Kroah-Hartman, devicetree,
linux-rpi-kernel, Russell King, Arnd Bergmann, linux-arm-kernel,
bcm-kernel-feedback-list
Hi Stefan,
On 14/09/2018 11:11, Stefan Wahren wrote:
> Hi Phil,
>
> Am 12.09.2018 um 18:42 schrieb Phil Elwell:
>> The ARM coprocessor registers include dcache line size, but there is no
>> function to expose this value. Rather than create a new one, use the
>> read_cpuid_id function to derive the correct value, which is 32 for
>> BCM2835 and 64 for BCM2836/7.
>>
>> Signed-off-by: Phil Elwell <phil@raspberrypi.org>
>> ---
>> .../interface/vchiq_arm/vchiq_2835_arm.c | 24 +++++++++++++++++-----
>> 1 file changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>> index e767209..3537f60 100644
>> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>> @@ -42,6 +42,7 @@
>> #include <linux/uaccess.h>
>> #include <linux/mm.h>
>> #include <linux/of.h>
>> +#include <asm/cputype.h>
>> #include <soc/bcm2835/raspberrypi-firmware.h>
>>
>> #define TOTAL_SLOTS (VCHIQ_SLOT_ZERO_SLOTS + 2 * 32)
>> @@ -81,13 +82,15 @@ static void __iomem *g_regs;
>> * VPU firmware, which determines the required alignment of the
>> * offsets/sizes in pagelists.
>> *
>> - * Modern VPU firmware looks for a DT "cache-line-size" property in
>> - * the VCHIQ node and will overwrite it with the actual L2 cache size,
>> + * Previous VPU firmware looked for a DT "cache-line-size" property in
>> + * the VCHIQ node and would overwrite it with the actual L2 cache size,
>> * which the kernel must then respect. That property was rejected
>> - * upstream, so we have to use the VPU firmware's compatibility value
>> - * of 32.
>> + * upstream, so we now rely on both sides to "do the right thing" independently
>> + * of the other. To improve backwards compatibility, this new behaviour is
>> + * signalled to the firmware by the use of a corrected "reg" property on the
>> + * relevant Device Tree node.
>> */
>> -static unsigned int g_cache_line_size = 32;
>> +static unsigned int g_cache_line_size;
>> static unsigned int g_fragments_size;
>> static char *g_fragments_base;
>> static char *g_free_fragments;
>> @@ -127,6 +130,17 @@ int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
>> if (err < 0)
>> return err;
>>
>> + /*
>> + * The tempting L1_CACHE_BYTES macro doesn't work in the case of
>> + * a kernel built with bcm2835_defconfig running on a BCM2836/7
>> + * processor, hence the need for a runtime check. The dcache line size
>> + * is encoded in one of the coprocessor registers, but there is no
>> + * convenient way to access it short of embedded assembler, hence
>> + * the use of read_cpuid_id(). The following test evaluates to true
>> + * on a BCM2835 showing that it is ARMv6-ish, whereas
>> + * cpu_architecture() will indicate that it is an ARMv7.
>> + */
>> + g_cache_line_size = ((read_cpuid_id() & 0x7f000) == 0x7b000) ? 32 : 64;
>
> is there a chance to use ARM_CPU_PART_ARM1176 or something else instead
> of this magic numbers?
One could. It's a trade-off between specific and human-readable vs. generic and less-so,
and possibly moot given the next question.
> Sorry, i cannot remember the whole discussion but why we can use
> different compatible here instead of using ARM specific stuff.
>
> compatible = "brcm,bcm2836-vchiq", "brcm,bcm2835-vchiq"; // for BCM2836/7
>
> compatible = "brcm,bcm2835"; // only for BCM2835
It wasn't an option you raised in the offline discussion. Assuming you meant
"brcm,bcm2835-vchiq", I suppose you could think of it as being a different type
of device if that is more acceptable.
>> g_fragments_size = 2 * g_cache_line_size;
>>
>> /* Allocate space for the channels in coherent memory */
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size
2018-09-14 10:26 ` Phil Elwell
@ 2018-09-14 11:03 ` Stefan Wahren
2018-09-14 11:09 ` Phil Elwell
0 siblings, 1 reply; 11+ messages in thread
From: Stefan Wahren @ 2018-09-14 11:03 UTC (permalink / raw)
To: Phil Elwell, Rob Herring, Greg Kroah-Hartman, devicetree,
linux-rpi-kernel, Russell King, Arnd Bergmann, linux-arm-kernel,
bcm-kernel-feedback-list
Hi,
Am 14.09.2018 um 12:26 schrieb Phil Elwell:
> Hi Stefan,
>
> On 14/09/2018 11:11, Stefan Wahren wrote:
>> Hi Phil,
>>
>> Am 12.09.2018 um 18:42 schrieb Phil Elwell:
>>> The ARM coprocessor registers include dcache line size, but there is no
>>> function to expose this value. Rather than create a new one, use the
>>> read_cpuid_id function to derive the correct value, which is 32 for
>>> BCM2835 and 64 for BCM2836/7.
>>>
>>> Signed-off-by: Phil Elwell <phil@raspberrypi.org>
>>> ---
>>> .../interface/vchiq_arm/vchiq_2835_arm.c | 24 +++++++++++++++++-----
>>> 1 file changed, 19 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>>> index e767209..3537f60 100644
>>> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>>> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>>> @@ -42,6 +42,7 @@
>>> #include <linux/uaccess.h>
>>> #include <linux/mm.h>
>>> #include <linux/of.h>
>>> +#include <asm/cputype.h>
>>> #include <soc/bcm2835/raspberrypi-firmware.h>
>>>
>>> #define TOTAL_SLOTS (VCHIQ_SLOT_ZERO_SLOTS + 2 * 32)
>>> @@ -81,13 +82,15 @@ static void __iomem *g_regs;
>>> * VPU firmware, which determines the required alignment of the
>>> * offsets/sizes in pagelists.
>>> *
>>> - * Modern VPU firmware looks for a DT "cache-line-size" property in
>>> - * the VCHIQ node and will overwrite it with the actual L2 cache size,
>>> + * Previous VPU firmware looked for a DT "cache-line-size" property in
>>> + * the VCHIQ node and would overwrite it with the actual L2 cache size,
>>> * which the kernel must then respect. That property was rejected
>>> - * upstream, so we have to use the VPU firmware's compatibility value
>>> - * of 32.
>>> + * upstream, so we now rely on both sides to "do the right thing" independently
>>> + * of the other. To improve backwards compatibility, this new behaviour is
>>> + * signalled to the firmware by the use of a corrected "reg" property on the
>>> + * relevant Device Tree node.
>>> */
>>> -static unsigned int g_cache_line_size = 32;
>>> +static unsigned int g_cache_line_size;
>>> static unsigned int g_fragments_size;
>>> static char *g_fragments_base;
>>> static char *g_free_fragments;
>>> @@ -127,6 +130,17 @@ int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
>>> if (err < 0)
>>> return err;
>>>
>>> + /*
>>> + * The tempting L1_CACHE_BYTES macro doesn't work in the case of
>>> + * a kernel built with bcm2835_defconfig running on a BCM2836/7
>>> + * processor, hence the need for a runtime check. The dcache line size
>>> + * is encoded in one of the coprocessor registers, but there is no
>>> + * convenient way to access it short of embedded assembler, hence
>>> + * the use of read_cpuid_id(). The following test evaluates to true
>>> + * on a BCM2835 showing that it is ARMv6-ish, whereas
>>> + * cpu_architecture() will indicate that it is an ARMv7.
>>> + */
>>> + g_cache_line_size = ((read_cpuid_id() & 0x7f000) == 0x7b000) ? 32 : 64;
>> is there a chance to use ARM_CPU_PART_ARM1176 or something else instead
>> of this magic numbers?
> One could. It's a trade-off between specific and human-readable vs. generic and less-so,
> and possibly moot given the next question.
>
>> Sorry, i cannot remember the whole discussion but why we can use
>> different compatible here instead of using ARM specific stuff.
>>
>> compatible = "brcm,bcm2836-vchiq", "brcm,bcm2835-vchiq"; // for BCM2836/7
>>
>> compatible = "brcm,bcm2835"; // only for BCM2835
> It wasn't an option you raised in the offline discussion.
maybe this was too obvious :-(
> Assuming you meant
> "brcm,bcm2835-vchiq", I suppose you could think of it as being a different type
> of device if that is more acceptable.
Yes, it's a different SoC. Do you see any problems with the VC4 firmware
according to this approach?
>
>>> g_fragments_size = 2 * g_cache_line_size;
>>>
>>> /* Allocate space for the channels in coherent memory */
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size
2018-09-14 11:03 ` Stefan Wahren
@ 2018-09-14 11:09 ` Phil Elwell
[not found] ` <2aaa0f5f-b54d-396c-737e-73591b3083c8-FnsA7b+Nu9XbIbC87yuRow@public.gmane.org>
0 siblings, 1 reply; 11+ messages in thread
From: Phil Elwell @ 2018-09-14 11:09 UTC (permalink / raw)
To: Stefan Wahren, Rob Herring, Greg Kroah-Hartman, devicetree,
linux-rpi-kernel, Russell King, Arnd Bergmann, linux-arm-kernel,
bcm-kernel-feedback-list
On 14/09/2018 12:03, Stefan Wahren wrote:
> Hi,
>
> Am 14.09.2018 um 12:26 schrieb Phil Elwell:
>> Hi Stefan,
>>
>> On 14/09/2018 11:11, Stefan Wahren wrote:
>>> Hi Phil,
>>>
>>> Am 12.09.2018 um 18:42 schrieb Phil Elwell:
>>>> The ARM coprocessor registers include dcache line size, but there is no
>>>> function to expose this value. Rather than create a new one, use the
>>>> read_cpuid_id function to derive the correct value, which is 32 for
>>>> BCM2835 and 64 for BCM2836/7.
>>>>
>>>> Signed-off-by: Phil Elwell <phil@raspberrypi.org>
>>>> ---
>>>> .../interface/vchiq_arm/vchiq_2835_arm.c | 24 +++++++++++++++++-----
>>>> 1 file changed, 19 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>>>> index e767209..3537f60 100644
>>>> --- a/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>>>> +++ b/drivers/staging/vc04_services/interface/vchiq_arm/vchiq_2835_arm.c
>>>> @@ -42,6 +42,7 @@
>>>> #include <linux/uaccess.h>
>>>> #include <linux/mm.h>
>>>> #include <linux/of.h>
>>>> +#include <asm/cputype.h>
>>>> #include <soc/bcm2835/raspberrypi-firmware.h>
>>>>
>>>> #define TOTAL_SLOTS (VCHIQ_SLOT_ZERO_SLOTS + 2 * 32)
>>>> @@ -81,13 +82,15 @@ static void __iomem *g_regs;
>>>> * VPU firmware, which determines the required alignment of the
>>>> * offsets/sizes in pagelists.
>>>> *
>>>> - * Modern VPU firmware looks for a DT "cache-line-size" property in
>>>> - * the VCHIQ node and will overwrite it with the actual L2 cache size,
>>>> + * Previous VPU firmware looked for a DT "cache-line-size" property in
>>>> + * the VCHIQ node and would overwrite it with the actual L2 cache size,
>>>> * which the kernel must then respect. That property was rejected
>>>> - * upstream, so we have to use the VPU firmware's compatibility value
>>>> - * of 32.
>>>> + * upstream, so we now rely on both sides to "do the right thing" independently
>>>> + * of the other. To improve backwards compatibility, this new behaviour is
>>>> + * signalled to the firmware by the use of a corrected "reg" property on the
>>>> + * relevant Device Tree node.
>>>> */
>>>> -static unsigned int g_cache_line_size = 32;
>>>> +static unsigned int g_cache_line_size;
>>>> static unsigned int g_fragments_size;
>>>> static char *g_fragments_base;
>>>> static char *g_free_fragments;
>>>> @@ -127,6 +130,17 @@ int vchiq_platform_init(struct platform_device *pdev, VCHIQ_STATE_T *state)
>>>> if (err < 0)
>>>> return err;
>>>>
>>>> + /*
>>>> + * The tempting L1_CACHE_BYTES macro doesn't work in the case of
>>>> + * a kernel built with bcm2835_defconfig running on a BCM2836/7
>>>> + * processor, hence the need for a runtime check. The dcache line size
>>>> + * is encoded in one of the coprocessor registers, but there is no
>>>> + * convenient way to access it short of embedded assembler, hence
>>>> + * the use of read_cpuid_id(). The following test evaluates to true
>>>> + * on a BCM2835 showing that it is ARMv6-ish, whereas
>>>> + * cpu_architecture() will indicate that it is an ARMv7.
>>>> + */
>>>> + g_cache_line_size = ((read_cpuid_id() & 0x7f000) == 0x7b000) ? 32 : 64;
>>> is there a chance to use ARM_CPU_PART_ARM1176 or something else instead
>>> of this magic numbers?
>> One could. It's a trade-off between specific and human-readable vs. generic and less-so,
>> and possibly moot given the next question.
>>
>>> Sorry, i cannot remember the whole discussion but why we can use
>>> different compatible here instead of using ARM specific stuff.
>>>
>>> compatible = "brcm,bcm2836-vchiq", "brcm,bcm2835-vchiq"; // for BCM2836/7
>>>
>>> compatible = "brcm,bcm2835"; // only for BCM2835
>> It wasn't an option you raised in the offline discussion.
>
> maybe this was too obvious :-(
>
>> Assuming you meant
>> "brcm,bcm2835-vchiq", I suppose you could think of it as being a different type
>> of device if that is more acceptable.
>
> Yes, it's a different SoC. Do you see any problems with the VC4 firmware
> according to this approach?
No, it can be made to work. Expect V2 shortly.
>
>>
>>>> g_fragments_size = 2 * g_cache_line_size;
>>>>
>>>> /* Allocate space for the channels in coherent memory */
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <107b707e-1f8c-1305-2582-0a131011758d-eS4NqCHxEME@public.gmane.org>]
* Re: [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size
[not found] ` <107b707e-1f8c-1305-2582-0a131011758d-eS4NqCHxEME@public.gmane.org>
@ 2018-09-14 10:41 ` Russell King - ARM Linux
0 siblings, 0 replies; 11+ messages in thread
From: Russell King - ARM Linux @ 2018-09-14 10:41 UTC (permalink / raw)
To: Stefan Wahren
Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, Arnd Bergmann,
Greg Kroah-Hartman, Rob Herring,
bcm-kernel-feedback-list-dY08KVG/lbpWk0Htik3J/w,
linux-rpi-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
On Fri, Sep 14, 2018 at 12:11:54PM +0200, Stefan Wahren wrote:
> Hi Phil,
>
> Am 12.09.2018 um 18:42 schrieb Phil Elwell:
> > + /*
> > + * The tempting L1_CACHE_BYTES macro doesn't work in the case of
> > + * a kernel built with bcm2835_defconfig running on a BCM2836/7
> > + * processor, hence the need for a runtime check. The dcache line size
> > + * is encoded in one of the coprocessor registers, but there is no
> > + * convenient way to access it short of embedded assembler, hence
> > + * the use of read_cpuid_id(). The following test evaluates to true
> > + * on a BCM2835 showing that it is ARMv6-ish, whereas
> > + * cpu_architecture() will indicate that it is an ARMv7.
> > + */
> > + g_cache_line_size = ((read_cpuid_id() & 0x7f000) == 0x7b000) ? 32 : 64;
>
> is there a chance to use ARM_CPU_PART_ARM1176 or something else instead
> of this magic numbers?
In any case, reading the CPU ID and masking it in this way is really
not on.
The above checks that the ID _may_ be a "new ID" format, and that the
top four bits of the CPU part number are 11. However, there are three
formats of the CPU ID register which can conflict if you do this kind
of trivial mask-and-test.
The relevant cases for are:
bits 19:16 inclusive being all ones define whether the ID is "new ID"
format with the additional CPU ID registers. This is used for all
modern CPUs.
Otherwise, if bit 19 is clear, it is an older ID format, and bits
18:16 may contain '7'.
The value of bits 15:4 depend on the implementer - "1011" in 15:12 are
used by Intel for StrongARM1110 (though they won't have '7' in 18:16),
ARM Ltd for ARM1136, ARM1156, ARM1176 and ARM11MPCore (which will have
'7' in 18:16) - and hence will match this test.
So, the above code does not match just BCM2835, but also matches other
CPUs.
There is a reason why the old macros that allow the part number to be
fetched in asm/cputype.h are marked as deprecated, and that is to
encourage correct behaviour - to use both the part number and
implementer fields.
The comment is also wrong. It claims that in order to read the cache
line size, one needs to use assembler. asm/cputype.h has macros and
inline functions (of which read_cpuid_id() is one of them) to allow
access without having assembler encoded in drivers. However, there is
a gotcha - the format of the cache type register also depends on the
CPU architecture, so it's not a simple case of grabbing the value of a
bitfield.
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 13.8Mbps down 630kbps up
According to speedtest.net: 13Mbps down 490kbps up
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2018-09-14 11:25 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-09-12 15:06 [PATCH 0/2] Improve VCHIQ cache line size handling Phil Elwell
[not found] ` <1536764809-132672-1-git-send-email-phil-FnsA7b+Nu9XbIbC87yuRow@public.gmane.org>
2018-09-12 15:06 ` [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size Phil Elwell
2018-09-12 15:06 ` [PATCH 2/2] ARM: dts: bcm283x: Correct mailbox register sizes Phil Elwell
2018-09-12 16:14 ` [PATCH 0/2] Improve VCHIQ cache line size handling Stefan Wahren
-- strict thread matches above, loose matches on Subject: below --
2018-09-12 16:42 Phil Elwell
2018-09-12 16:42 ` [PATCH 1/2] staging/vc04_services: Derive g_cache_line_size Phil Elwell
2018-09-14 10:11 ` Stefan Wahren
2018-09-14 10:26 ` Phil Elwell
2018-09-14 11:03 ` Stefan Wahren
2018-09-14 11:09 ` Phil Elwell
[not found] ` <2aaa0f5f-b54d-396c-737e-73591b3083c8-FnsA7b+Nu9XbIbC87yuRow@public.gmane.org>
2018-09-14 11:25 ` Stefan Wahren
[not found] ` <107b707e-1f8c-1305-2582-0a131011758d-eS4NqCHxEME@public.gmane.org>
2018-09-14 10:41 ` Russell King - ARM Linux
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).