* [U-Boot] armv7 DMA and cache mangement functions
@ 2015-08-24 7:54 Markus Niebel
2015-08-27 17:19 ` Markus Niebel
2015-08-27 17:41 ` Mark Rutland
0 siblings, 2 replies; 4+ messages in thread
From: Markus Niebel @ 2015-08-24 7:54 UTC (permalink / raw)
To: u-boot
Hello,

I'm not an expert in the low-level details of this area, so please excuse any
wrong assumptions in this post.

Hardware: i.MX6 Solo (TQMa6 on custom mainboard)
U-Boot: 2014.10
gcc: 4.8.3

We see an error using TFTP on i.MX6 that seems to be triggered when the code / data size goes
over a limit. The code changes have nothing to do with the network stack, network drivers, or
memory management. TFTP becomes completely unusable: the device frequently sees erroneous
packets with a variety of weird errors. If the code stays below this size, everything works fine.

So far we have checked a lot of things. The following led us to assume that this
could be cache related:

dynamically disabling the data cache before doing TFTP: TFTP works well again
running with the L2 cache disabled (data cache enabled): TFTP works well again

Looking at the code in drivers/net/fec_mxc.c, function fec_recv, we see a call to
invalidate_dcache_range before the received Ethernet data is accessed. Looking at
the code for invalidate_dcache_range in arch/arm/cpu/armv7/cache_v7.c and comparing it with
how this is done in Linux and barebox, we noticed that the order of L2 cache / data cache
invalidation is simply swapped there relative to U-Boot. With that order applied to the
receive code for fec_mxc, TFTP works again.

Question: is the order of cache invalidation important?

Thanks
Markus
* [U-Boot] armv7 DMA and cache mangement functions
2015-08-24 7:54 [U-Boot] armv7 DMA and cache mangement functions Markus Niebel
@ 2015-08-27 17:19 ` Markus Niebel
2015-08-27 17:41 ` Mark Rutland
1 sibling, 0 replies; 4+ messages in thread
From: Markus Niebel @ 2015-08-27 17:19 UTC (permalink / raw)
To: u-boot
Just a friendly ping:
On 24.08.2015 at 09:54, Markus Niebel wrote:
> Question: is the order of cache invalidation important?
* [U-Boot] armv7 DMA and cache mangement functions
2015-08-24 7:54 [U-Boot] armv7 DMA and cache mangement functions Markus Niebel
2015-08-27 17:19 ` Markus Niebel
@ 2015-08-27 17:41 ` Mark Rutland
2015-08-28 8:43 ` Markus Niebel
1 sibling, 1 reply; 4+ messages in thread
From: Mark Rutland @ 2015-08-27 17:41 UTC (permalink / raw)
To: u-boot
On Mon, Aug 24, 2015 at 08:54:17AM +0100, Markus Niebel wrote:
> Looking at the code in drivers/net/fec_mxc.c, function fec_recv, we see a call to
> invalidate_dcache_range before the received Ethernet data is accessed. Looking at
> the code for invalidate_dcache_range in arch/arm/cpu/armv7/cache_v7.c and comparing it with
> how this is done in Linux and barebox, we noticed that the order of L2 cache / data cache
> invalidation is simply swapped there relative to U-Boot. With that order applied to the
> receive code for fec_mxc, TFTP works again.
>
> Question: is the order of cache invalidation important?
The order is important.

Consider the case where both the external and architected caches contain
stale (but clean) cache lines for the region you care about.

If you invalidate the architected caches before the external L2, the
architected caches may speculatively fetch (stale) data from the L2
before the L2 is invalidated, and so in the end you may still see stale
data in the architected caches.

If you invalidate the L2 first, the architected caches could
speculatively fetch from the L2 (stale) or memory (new) while this is in
progress, but they will then be invalidated, and from then on can only
fetch the new data.

That assumes that both levels were clean to begin with. If they are not,
then additional maintenance is required. It's also conceivable that
caches could be implemented such that the above is insufficient, so YMMV.
Thanks,
Mark.
* [U-Boot] armv7 DMA and cache mangement functions
2015-08-27 17:41 ` Mark Rutland
@ 2015-08-28 8:43 ` Markus Niebel
0 siblings, 0 replies; 4+ messages in thread
From: Markus Niebel @ 2015-08-28 8:43 UTC (permalink / raw)
To: u-boot
On 27.08.2015 at 19:41, Mark Rutland wrote:
> The order is important.
>
> Consider the case where both the external and architected caches contain
> stale (but clean) cache lines for the region you care about.
>
> If you invalidate the architected caches before the external L2, the
> architected caches may speculatively fetch (stale) data from the L2
> before the L2 is invalidated, and so in the end you may still see stale
> data in the architected caches.
>
> If you invalidate the L2 first, the architected caches could
> speculatively fetch from the L2 (stale) or memory (new) while this is in
> progress, but they will then be invalidated, and from then on can only
> fetch the new data.
>
> That assumes that both levels were clean to begin with. If they are not,
> then additional maintenance is required. It's also conceivable that
> caches could be implemented such that the above is insufficient, YMMV.
>
Thank you for the clarification. Given your statement, the code in question,
arch/arm/cpu/armv7/cache_v7.c:

/*
 * Invalidates range in all levels of D-cache/unified cache used:
 * Affects the range [start, stop - 1]
 */
void invalidate_dcache_range(unsigned long start, unsigned long stop)
{
	v7_dcache_maint_range(start, stop, ARMV7_DCACHE_INVAL_RANGE);
	v7_outer_cache_inval_range(start, stop);
}

is wrong, and so is the flush_dcache_all logic.

Given http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka14802.html
the cache invalidate handler needs to be fixed in a way that also takes into account
the platforms providing atomic maintenance.

Correct me if I'm wrong.

Regards
Markus