* ARM caches variants. @ 2010-03-23 12:39 Gilles Chanteperdrix 2010-03-23 12:53 ` Catalin Marinas 0 siblings, 1 reply; 16+ messages in thread
From: Gilles Chanteperdrix @ 2010-03-23 12:39 UTC (permalink / raw)
To: linux-arm-kernel

Hi,

I am trying to have a better understanding of the issues with the various types of caches used on ARM, and I came to a question I feel is really stupid, but cannot find the answer myself.

As I understood, the VIVT cache has the following issues:
- issue #1; two processes may use the same virtual address for different physical addresses, but will share the same cache line and see memory corruption if no precaution is taken;
- issue #2; two processes may use different virtual addresses for the same physical address (shared mapping), but will use different cache lines, causing all sorts of incoherence if no precaution is taken;
- issue #3; the same process may use different virtual addresses for the same physical address (same shared mapping mapped several times in the same process virtual memory), basically almost the same issue as issue #2.

The Linux kernel solves issues #1 and #2 by flushing the cache at every context switch, and issue #3 by remapping the multiply mapped shared mapping in "uncacheable, buffered only" mode if the write buffer is sufficiently well behaved, or "uncacheable, unbuffered" mode if the write buffer is buggy.

Now, if we look at VIPT, aliasing caches:
- the physical tagging solves issue #1 automatically,
- the cache colouring technique used in arch_get_unmapped_area solves issue #2 and #3 by ensuring that the areas using the same physical address will end up using the same cache lines, and avoid aliases "by construction".

VIPT non-aliasing caches have none of the three issues.

First question: did I get it right?
Second question: do issue #1, #2 and #3 have official non-ambiguous names?
Now, the stupid question: why not using the cache colouring technique used for VIPT caches to solve issue #3 with VIVT caches?
I have thought about several answers:
- there is a technical reason which will make the cache colouring technique fail with VIVT;
- the reason is purely historical, the two solutions were implemented at different times;
- issue #3 is a corner case and the cache colouring technique induces a potential waste of virtual address space, so the trade-off is not interesting.

Thanks in advance for your answer.

--
Gilles.

^ permalink raw reply [flat|nested] 16+ messages in thread
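The arch_get_unmapped_area-style colouring mentioned above can be sketched as follows. This is a minimal illustration, not the kernel's actual code: the 16KB colour period, 4KB page size, and helper names are all assumptions chosen to match a 4-colour aliasing-VIPT cache.

```c
#include <stdint.h>

/* Illustrative aliasing-VIPT geometry (not any particular CPU):
 * 16KB per cache way, 4KB pages, so bits 13..12 of an address
 * select one of 4 page colours. */
#define COLOUR_PERIOD  (16u * 1024u)
#define PAGE_SZ        4096u

/* Colour of an address: which 4KB page slot it occupies within a
 * 16KB block. */
static unsigned int va_colour(uint32_t addr)
{
    return (addr & (COLOUR_PERIOD - 1)) / PAGE_SZ;
}

/* Pick the first address at or above 'hint' with the requested
 * colour: shared mappings given matching colours index the same
 * cache sets, avoiding aliases "by construction". */
static uint32_t colour_align(uint32_t hint, unsigned int colour)
{
    uint32_t base = (hint + COLOUR_PERIOD - 1) & ~(COLOUR_PERIOD - 1);
    return base + colour * PAGE_SZ;
}
```

Any two mappings of the same file placed with `colour_align` and the same colour land on the same cache lines on an aliasing VIPT cache.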
* ARM caches variants. 2010-03-23 12:39 ARM caches variants Gilles Chanteperdrix @ 2010-03-23 12:53 ` Catalin Marinas 2010-03-23 13:15 ` Gilles Chanteperdrix 0 siblings, 1 reply; 16+ messages in thread From: Catalin Marinas @ 2010-03-23 12:53 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: > As I understood, the VIVT cache has the following issues: > - issue #1; two processes may use the same virtual address for different > physical addresses, but will share the same cache line and see memory > corruption if no precaution is taken; > - issue #2; two processes may use different virtual addresses for the > same physical address (shared mapping), but will use different cache > lines, causing all sorts of incoherence if no precaution is taken; > - issue #3; the same process may use different virtual addresses for the > same physical address (same shared mapping mapped several time in the > same process virtual memory), basically almost the same issue as issue #2. [...] > Now, if we look at VIPT, aliasing caches: > - the physical tagging solves issue #1 automatically, > - the cache colouring technique used in arch_get_unmapped_area solves > issue #2 and #3 by ensuring that the areas using the same physical > address will end up using the same cache lines, and avoid aliases "by > construction". [...] > First question: did I get it right? Yes. > Second question: do issue #1, #2 and #3 have official non-ambiguous names? I don't think there are any official names. You could say cache aliasing though not sure it covers everything. > Now, the stupid question: why not using the cache colouring technique > used for VIPT caches to solve issue #3 with VIVT caches? Because with aliasing VIPT it is guaranteed that if a virtual address has the same offset in a 16KB block (i.e. 
the same colour - there are only 4 colours given by bits 13 and 12 of the virtual address), you get the same cache line allocated for a given physical address. The tag of a cache line is given by bits 31..14 of the physical address. With VIVT, the cache tags are not aware of the physical address, hence you can have 2^20 colours (bits 31..12 of the virtual address). You would need to map a physical address at the same virtual address in all applications sharing it (and you may end up with uClinux :)). -- Catalin ^ permalink raw reply [flat|nested] 16+ messages in thread
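The guarantee described above (same colour plus same physical address implies the same cache line) can be written out for the geometry in the message: 16KB per way and 32B lines give a set index in VA bits 13..5 and a tag in PA bits 31..14. A sketch under those assumptions:

```c
#include <stdint.h>

/* Geometry from the message: 16KB per way, 32B (2^5) lines, so
 * 512 sets. The set index comes from VA bits 13..5; the tag comes
 * from PA bits 31..14 (physical tagging). */
static uint32_t vipt_set(uint32_t va) { return (va >> 5) & 0x1FF; }
static uint32_t vipt_tag(uint32_t pa) { return pa >> 14; }
```

Since the tag depends only on the PA, two VAs that map one PA and agree in bits 13..12 (same colour) produce the same set and the same tag, i.e. the same cache line.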
* ARM caches variants. 2010-03-23 12:53 ` Catalin Marinas @ 2010-03-23 13:15 ` Gilles Chanteperdrix 2010-03-23 13:42 ` Catalin Marinas 0 siblings, 1 reply; 16+ messages in thread From: Gilles Chanteperdrix @ 2010-03-23 13:15 UTC (permalink / raw) To: linux-arm-kernel Catalin Marinas wrote: > On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: >> As I understood, the VIVT cache has the following issues: >> - issue #1; two processes may use the same virtual address for different >> physical addresses, but will share the same cache line and see memory >> corruption if no precaution is taken; >> - issue #2; two processes may use different virtual addresses for the >> same physical address (shared mapping), but will use different cache >> lines, causing all sorts of incoherence if no precaution is taken; >> - issue #3; the same process may use different virtual addresses for the >> same physical address (same shared mapping mapped several time in the >> same process virtual memory), basically almost the same issue as issue #2. > [...] >> Now, if we look at VIPT, aliasing caches: >> - the physical tagging solves issue #1 automatically, >> - the cache colouring technique used in arch_get_unmapped_area solves >> issue #2 and #3 by ensuring that the areas using the same physical >> address will end up using the same cache lines, and avoid aliases "by >> construction". > [...] >> First question: did I get it right? > > Yes. > >> Second question: do issue #1, #2 and #3 have official non-ambiguous names? > > I don't think there are any official names. You could say cache aliasing > though not sure it covers everything. > >> Now, the stupid question: why not using the cache colouring technique >> used for VIPT caches to solve issue #3 with VIVT caches? > > Because with aliasing VIPT it is guaranteed that if a virtual address > has the same offset in a 16KB block (i.e. 
the same colour - there are > only 4 colours given by bits 13 and 12 of the virtual address), you get > the same cache line allocated for a given physical address. The tag of a > cache line is given by bits 31..14 of the physical address. > > With VIVT, the cache tags are not aware of the physical address, hence > you can have 2^20 colours (bits 31..12 of the virtual address). You > would need to map a physical address at the same virtual address in all > applications sharing it (and you may end up with uClinux :)). Ok. I do not get it. Let us do it in slow motion: as I understand, the problem with issue #2 and #3 is not really about the tag, but about two different virtual addresses ending up using different cache lines, whatever the tag. By using cache colouring, can not we ensure that they end up in the same cache line and simply evict each other because they do not have the same tag? In other word, is not the cache line used by virtual address addr: (addr % cache size) / (cache line size) ? Or is it precisely what armv6 guarantees? -- Gilles. ^ permalink raw reply [flat|nested] 16+ messages in thread
* ARM caches variants. 2010-03-23 13:15 ` Gilles Chanteperdrix @ 2010-03-23 13:42 ` Catalin Marinas 2010-03-23 13:59 ` Gilles Chanteperdrix 2010-03-23 23:49 ` Jamie Lokier 0 siblings, 2 replies; 16+ messages in thread From: Catalin Marinas @ 2010-03-23 13:42 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2010-03-23 at 13:15 +0000, Gilles Chanteperdrix wrote: > Catalin Marinas wrote: > > On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: > >> Now, the stupid question: why not using the cache colouring technique > >> used for VIPT caches to solve issue #3 with VIVT caches? > > > > Because with aliasing VIPT it is guaranteed that if a virtual address > > has the same offset in a 16KB block (i.e. the same colour - there are > > only 4 colours given by bits 13 and 12 of the virtual address), you get > > the same cache line allocated for a given physical address. The tag of a > > cache line is given by bits 31..14 of the physical address. > > > > With VIVT, the cache tags are not aware of the physical address, hence > > you can have 2^20 colours (bits 31..12 of the virtual address). You > > would need to map a physical address at the same virtual address in all > > applications sharing it (and you may end up with uClinux :)). > > Ok. I do not get it. Let us do it in slow motion: as I understand, the > problem with issue #2 and #3 is not really about the tag, but about two > different virtual addresses ending up using different cache lines, > whatever the tag. By using cache colouring, can not we ensure that they > end up in the same cache line and simply evict each other because they > do not have the same tag? > > In other word, is not the cache line used by virtual address addr: > (addr % cache size) / (cache line size) With any cache line, you have an index and a tag for identifying it. The cache may have multiple ways (e.g. 4-way associative) to speed up the look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). 
If the cache line size is 32B (2^5), the index of a cache line is: addr & (2^13 - 1) >> 5 e.g. bits 12..5 from the VA are used for indexing the cache line. The tag is given by the rest of the top bits, in the above case bits 31..13 of the VA (if VIVT cache) or PA (VIPT cache). The cache look-up for a VA goes something like this: 1. extracts the index. With a 4-way associative cache there are 4 possible cache lines for this index 2. extracts the tag (from either VA or PA, depending on the cache type). For VIPT caches, it needs to do a TLB look-up as well to find the physical address 3. check the four cache lines identified by the index at step 1 against their tag 4. if the tag matches, you get a hit, otherwise a miss For your #2 and #3 issues, if two processes map the same PA using different VAs, data can end up pretty much anywhere in a VIVT cache. If you calculate the index and tag (used to identify a cache line) for two different VAs, the only common part are bits 11..5 of the index (since they are inside a page). If you want to have the same index and tag for the two different VAs, you end up with having to use the same VA in both processes. With VIPT caches, the tag is the same for issues #2 and #3. The only difference may be in a few top bits of the index. In the above case, it's bit 12 of the VA which may differ. This gives you two page colours (with 64KB 4-way associative cache you have 2 bits for the colour resulting in 4 colours). -- Catalin ^ permalink raw reply [flat|nested] 16+ messages in thread
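The index/tag split described above can be made concrete for the 32KB 4-way example (8KB per way, 32B lines). Note that the formula in the message needs parentheses when written in C, since `&` binds more loosely than `>>`:

```c
#include <stdint.h>

/* 32KB 4-way cache: 8KB (2^13) per way, 32B (2^5) lines. */
#define WAY_SIZE   8192u
#define LINE_SHIFT 5

/* Set index: bits 12..5 of the address.
 * In C the masking must be parenthesized:
 * (addr & (WAY_SIZE - 1)) >> LINE_SHIFT. */
static uint32_t line_index(uint32_t addr)
{
    return (addr & (WAY_SIZE - 1)) >> LINE_SHIFT;
}

/* Tag: bits 31..13 of the address (VA for VIVT, PA for VIPT). */
static uint32_t line_tag(uint32_t addr)
{
    return addr >> 13;
}
```

As the message says, only index bits 11..5 (the offset within a 4KB page) are common to two different VAs mapping one page; bit 12 and the whole tag can differ.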
* ARM caches variants. 2010-03-23 13:42 ` Catalin Marinas @ 2010-03-23 13:59 ` Gilles Chanteperdrix 2010-03-23 14:33 ` Catalin Marinas 2010-03-23 23:49 ` Jamie Lokier 1 sibling, 1 reply; 16+ messages in thread From: Gilles Chanteperdrix @ 2010-03-23 13:59 UTC (permalink / raw) To: linux-arm-kernel Catalin Marinas wrote: > On Tue, 2010-03-23 at 13:15 +0000, Gilles Chanteperdrix wrote: >> Catalin Marinas wrote: >>> On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: >>>> Now, the stupid question: why not using the cache colouring technique >>>> used for VIPT caches to solve issue #3 with VIVT caches? >>> Because with aliasing VIPT it is guaranteed that if a virtual address >>> has the same offset in a 16KB block (i.e. the same colour - there are >>> only 4 colours given by bits 13 and 12 of the virtual address), you get >>> the same cache line allocated for a given physical address. The tag of a >>> cache line is given by bits 31..14 of the physical address. >>> >>> With VIVT, the cache tags are not aware of the physical address, hence >>> you can have 2^20 colours (bits 31..12 of the virtual address). You >>> would need to map a physical address at the same virtual address in all >>> applications sharing it (and you may end up with uClinux :)). >> Ok. I do not get it. Let us do it in slow motion: as I understand, the >> problem with issue #2 and #3 is not really about the tag, but about two >> different virtual addresses ending up using different cache lines, >> whatever the tag. By using cache colouring, can not we ensure that they >> end up in the same cache line and simply evict each other because they >> do not have the same tag? >> >> In other word, is not the cache line used by virtual address addr: >> (addr % cache size) / (cache line size) > > With any cache line, you have an index and a tag for identifying it. The > cache may have multiple ways (e.g. 4-way associative) to speed up the > look-up. 
For a 32KB 4-way associative cache you have 8KB per way (2^13). > > If the cache line size is 32B (2^5), the index of a cache line is: > > addr & (2^13 - 1) >> 5 > > e.g. bits 12..5 from the VA are used for indexing the cache line. > > The tag is given by the rest of the top bits, in the above case bits > 31..13 of the VA (if VIVT cache) or PA (VIPT cache). > > The cache look-up for a VA goes something like this: > > 1. extracts the index. With a 4-way associative cache there are 4 > possible cache lines for this index > 2. extracts the tag (from either VA or PA, depending on the cache > type). For VIPT caches, it needs to do a TLB look-up as well to > find the physical address > 3. check the four cache lines identified by the index at step 1 > against their tag > 4. if the tag matches, you get a hit, otherwise a miss > > For your #2 and #3 issues, if two processes map the same PA using > different VAs, data can end up pretty much anywhere in a VIVT cache. If > you calculate the index and tag (used to identify a cache line) for two > different VAs, the only common part are bits 11..5 of the index (since > they are inside a page). If you want to have the same index and tag for > the two different VAs, you end up with having to use the same VA in both > processes. > > With VIPT caches, the tag is the same for issues #2 and #3. The only > difference may be in a few top bits of the index. In the above case, > it's bit 12 of the VA which may differ. This gives you two page colours > (with 64KB 4-way associative cache you have 2 bits for the colour > resulting in 4 colours). > Thanks for the explanation, I need to read your e-mail in detail to understand it fully. It seemed to me that having the same index was enough to solve issues #2 and #3, and that it was possible by using cache coulouring, but as I understand, the fact that a cache can have multiple ways means that the same index can index several cache lines. This is exactly the information I was looking for. 
--
Gilles.

* ARM caches variants. 2010-03-23 13:59 ` Gilles Chanteperdrix @ 2010-03-23 14:33 ` Catalin Marinas 2010-03-23 14:39 ` Gilles Chanteperdrix 0 siblings, 1 reply; 16+ messages in thread From: Catalin Marinas @ 2010-03-23 14:33 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2010-03-23 at 13:59 +0000, Gilles Chanteperdrix wrote: > Catalin Marinas wrote: > > On Tue, 2010-03-23 at 13:15 +0000, Gilles Chanteperdrix wrote: > >> Catalin Marinas wrote: > >>> On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: > >>>> Now, the stupid question: why not using the cache colouring technique > >>>> used for VIPT caches to solve issue #3 with VIVT caches? > >>> Because with aliasing VIPT it is guaranteed that if a virtual address > >>> has the same offset in a 16KB block (i.e. the same colour - there are > >>> only 4 colours given by bits 13 and 12 of the virtual address), you get > >>> the same cache line allocated for a given physical address. The tag of a > >>> cache line is given by bits 31..14 of the physical address. > >>> > >>> With VIVT, the cache tags are not aware of the physical address, hence > >>> you can have 2^20 colours (bits 31..12 of the virtual address). You > >>> would need to map a physical address at the same virtual address in all > >>> applications sharing it (and you may end up with uClinux :)). > >> Ok. I do not get it. Let us do it in slow motion: as I understand, the > >> problem with issue #2 and #3 is not really about the tag, but about two > >> different virtual addresses ending up using different cache lines, > >> whatever the tag. By using cache colouring, can not we ensure that they > >> end up in the same cache line and simply evict each other because they > >> do not have the same tag? > >> > >> In other word, is not the cache line used by virtual address addr: > >> (addr % cache size) / (cache line size) > > > > With any cache line, you have an index and a tag for identifying it. The > > cache may have multiple ways (e.g. 
4-way associative) to speed up the > > look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). > > > > If the cache line size is 32B (2^5), the index of a cache line is: > > > > addr & (2^13 - 1) >> 5 > > > > e.g. bits 12..5 from the VA are used for indexing the cache line. > > > > The tag is given by the rest of the top bits, in the above case bits > > 31..13 of the VA (if VIVT cache) or PA (VIPT cache). > > > > The cache look-up for a VA goes something like this: > > > > 1. extracts the index. With a 4-way associative cache there are 4 > > possible cache lines for this index > > 2. extracts the tag (from either VA or PA, depending on the cache > > type). For VIPT caches, it needs to do a TLB look-up as well to > > find the physical address > > 3. check the four cache lines identified by the index at step 1 > > against their tag > > 4. if the tag matches, you get a hit, otherwise a miss > > > > For your #2 and #3 issues, if two processes map the same PA using > > different VAs, data can end up pretty much anywhere in a VIVT cache. If > > you calculate the index and tag (used to identify a cache line) for two > > different VAs, the only common part are bits 11..5 of the index (since > > they are inside a page). If you want to have the same index and tag for > > the two different VAs, you end up with having to use the same VA in both > > processes. > > > > With VIPT caches, the tag is the same for issues #2 and #3. The only > > difference may be in a few top bits of the index. In the above case, > > it's bit 12 of the VA which may differ. This gives you two page colours > > (with 64KB 4-way associative cache you have 2 bits for the colour > > resulting in 4 colours). > > > > Thanks for the explanation, I need to read your e-mail in detail to > understand it fully. 
It seemed to me that having the same index was > enough to solve issues #2 and #3, and that it was possible by using > cache coulouring, but as I understand, the fact that a cache can have > multiple ways means that the same index can index several cache lines. Even if you have a 1-way associative cache (some processors allow the disabling of the other 3 ways if you want to try), the tag stored with the cache line is different between different VAs on a VIVT cache. So with two different VAs mapping the same PA, if a VA0 access allocates the cache line and VA1 would find the same cache line via the index calculation, it would get a cache miss because the tags for VA0 and VA1 do not match. -- Catalin ^ permalink raw reply [flat|nested] 16+ messages in thread
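The tag-mismatch miss described above can be modelled with a few lines of C. This is a toy 1-way (direct-mapped) VIVT model with only a valid bit and a virtual tag per set; the 8KB way and 32B line geometry are carried over from the earlier example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy direct-mapped VIVT cache: 8KB way, 32B lines -> 256 sets. */
#define NSETS 256u

struct vivt_line { bool valid; uint32_t tag; };
static struct vivt_line vivt[NSETS];

static unsigned int set_of(uint32_t va) { return (va >> 5) & (NSETS - 1); }
static uint32_t     tag_of(uint32_t va) { return va >> 13; }

/* Returns true on a hit; on a miss, allocates the set for 'va',
 * evicting whatever virtual tag was there before. */
static bool cache_access(uint32_t va)
{
    struct vivt_line *l = &vivt[set_of(va)];
    if (l->valid && l->tag == tag_of(va))
        return true;
    l->valid = true;
    l->tag = tag_of(va);
    return false;
}
```

Two aliases of one PA that share set bits 12..5 but differ in the upper VA bits keep missing and evicting each other, exactly because the stored (virtual) tags never match.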
* ARM caches variants. 2010-03-23 14:33 ` Catalin Marinas @ 2010-03-23 14:39 ` Gilles Chanteperdrix 2010-03-23 14:46 ` Catalin Marinas 0 siblings, 1 reply; 16+ messages in thread From: Gilles Chanteperdrix @ 2010-03-23 14:39 UTC (permalink / raw) To: linux-arm-kernel Catalin Marinas wrote: > On Tue, 2010-03-23 at 13:59 +0000, Gilles Chanteperdrix wrote: >> Catalin Marinas wrote: >>> On Tue, 2010-03-23 at 13:15 +0000, Gilles Chanteperdrix wrote: >>>> Catalin Marinas wrote: >>>>> On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: >>>>>> Now, the stupid question: why not using the cache colouring technique >>>>>> used for VIPT caches to solve issue #3 with VIVT caches? >>>>> Because with aliasing VIPT it is guaranteed that if a virtual address >>>>> has the same offset in a 16KB block (i.e. the same colour - there are >>>>> only 4 colours given by bits 13 and 12 of the virtual address), you get >>>>> the same cache line allocated for a given physical address. The tag of a >>>>> cache line is given by bits 31..14 of the physical address. >>>>> >>>>> With VIVT, the cache tags are not aware of the physical address, hence >>>>> you can have 2^20 colours (bits 31..12 of the virtual address). You >>>>> would need to map a physical address at the same virtual address in all >>>>> applications sharing it (and you may end up with uClinux :)). >>>> Ok. I do not get it. Let us do it in slow motion: as I understand, the >>>> problem with issue #2 and #3 is not really about the tag, but about two >>>> different virtual addresses ending up using different cache lines, >>>> whatever the tag. By using cache colouring, can not we ensure that they >>>> end up in the same cache line and simply evict each other because they >>>> do not have the same tag? >>>> >>>> In other word, is not the cache line used by virtual address addr: >>>> (addr % cache size) / (cache line size) >>> With any cache line, you have an index and a tag for identifying it. 
The >>> cache may have multiple ways (e.g. 4-way associative) to speed up the >>> look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). >>> >>> If the cache line size is 32B (2^5), the index of a cache line is: >>> >>> addr & (2^13 - 1) >> 5 >>> >>> e.g. bits 12..5 from the VA are used for indexing the cache line. >>> >>> The tag is given by the rest of the top bits, in the above case bits >>> 31..13 of the VA (if VIVT cache) or PA (VIPT cache). >>> >>> The cache look-up for a VA goes something like this: >>> >>> 1. extracts the index. With a 4-way associative cache there are 4 >>> possible cache lines for this index >>> 2. extracts the tag (from either VA or PA, depending on the cache >>> type). For VIPT caches, it needs to do a TLB look-up as well to >>> find the physical address >>> 3. check the four cache lines identified by the index at step 1 >>> against their tag >>> 4. if the tag matches, you get a hit, otherwise a miss >>> >>> For your #2 and #3 issues, if two processes map the same PA using >>> different VAs, data can end up pretty much anywhere in a VIVT cache. If >>> you calculate the index and tag (used to identify a cache line) for two >>> different VAs, the only common part are bits 11..5 of the index (since >>> they are inside a page). If you want to have the same index and tag for >>> the two different VAs, you end up with having to use the same VA in both >>> processes. >>> >>> With VIPT caches, the tag is the same for issues #2 and #3. The only >>> difference may be in a few top bits of the index. In the above case, >>> it's bit 12 of the VA which may differ. This gives you two page colours >>> (with 64KB 4-way associative cache you have 2 bits for the colour >>> resulting in 4 colours). >>> >> Thanks for the explanation, I need to read your e-mail in detail to >> understand it fully. 
It seemed to me that having the same index was >> enough to solve issues #2 and #3, and that it was possible by using >> cache coulouring, but as I understand, the fact that a cache can have >> multiple ways means that the same index can index several cache lines. > > Even if you have a 1-way associative cache (some processors allow the > disabling of the other 3 ways if you want to try), the tag stored with > the cache line is different between different VAs on a VIVT cache. > > So with two different VAs mapping the same PA, if a VA0 access allocates > the cache line and VA1 would find the same cache line via the index > calculation, it would get a cache miss because the tags for VA0 and VA1 > do not match. But if we assume that it evicts the contents of VA0 and allocates the cache for VA1 when VA1 is accessed, the system would just work. Or am I still assuming too much about the inner workings of the cache and there are cases where the cache could decide to keep the cache line allocated for VA0 and simply make a RAM access to fulfil the access through VA1? -- Gilles. ^ permalink raw reply [flat|nested] 16+ messages in thread
* ARM caches variants. 2010-03-23 14:39 ` Gilles Chanteperdrix @ 2010-03-23 14:46 ` Catalin Marinas 2010-03-23 14:47 ` Catalin Marinas ` (2 more replies) 0 siblings, 3 replies; 16+ messages in thread From: Catalin Marinas @ 2010-03-23 14:46 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2010-03-23 at 14:39 +0000, Gilles Chanteperdrix wrote: > Catalin Marinas wrote: > > On Tue, 2010-03-23 at 13:59 +0000, Gilles Chanteperdrix wrote: > >> Catalin Marinas wrote: > >>> On Tue, 2010-03-23 at 13:15 +0000, Gilles Chanteperdrix wrote: > >>>> Catalin Marinas wrote: > >>>>> On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: > >>>>>> Now, the stupid question: why not using the cache colouring technique > >>>>>> used for VIPT caches to solve issue #3 with VIVT caches? > >>>>> Because with aliasing VIPT it is guaranteed that if a virtual address > >>>>> has the same offset in a 16KB block (i.e. the same colour - there are > >>>>> only 4 colours given by bits 13 and 12 of the virtual address), you get > >>>>> the same cache line allocated for a given physical address. The tag of a > >>>>> cache line is given by bits 31..14 of the physical address. > >>>>> > >>>>> With VIVT, the cache tags are not aware of the physical address, hence > >>>>> you can have 2^20 colours (bits 31..12 of the virtual address). You > >>>>> would need to map a physical address at the same virtual address in all > >>>>> applications sharing it (and you may end up with uClinux :)). > >>>> Ok. I do not get it. Let us do it in slow motion: as I understand, the > >>>> problem with issue #2 and #3 is not really about the tag, but about two > >>>> different virtual addresses ending up using different cache lines, > >>>> whatever the tag. By using cache colouring, can not we ensure that they > >>>> end up in the same cache line and simply evict each other because they > >>>> do not have the same tag? 
> >>>> > >>>> In other word, is not the cache line used by virtual address addr: > >>>> (addr % cache size) / (cache line size) > >>> With any cache line, you have an index and a tag for identifying it. The > >>> cache may have multiple ways (e.g. 4-way associative) to speed up the > >>> look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). > >>> > >>> If the cache line size is 32B (2^5), the index of a cache line is: > >>> > >>> addr & (2^13 - 1) >> 5 > >>> > >>> e.g. bits 12..5 from the VA are used for indexing the cache line. > >>> > >>> The tag is given by the rest of the top bits, in the above case bits > >>> 31..13 of the VA (if VIVT cache) or PA (VIPT cache). > >>> > >>> The cache look-up for a VA goes something like this: > >>> > >>> 1. extracts the index. With a 4-way associative cache there are 4 > >>> possible cache lines for this index > >>> 2. extracts the tag (from either VA or PA, depending on the cache > >>> type). For VIPT caches, it needs to do a TLB look-up as well to > >>> find the physical address > >>> 3. check the four cache lines identified by the index at step 1 > >>> against their tag > >>> 4. if the tag matches, you get a hit, otherwise a miss > >>> > >>> For your #2 and #3 issues, if two processes map the same PA using > >>> different VAs, data can end up pretty much anywhere in a VIVT cache. If > >>> you calculate the index and tag (used to identify a cache line) for two > >>> different VAs, the only common part are bits 11..5 of the index (since > >>> they are inside a page). If you want to have the same index and tag for > >>> the two different VAs, you end up with having to use the same VA in both > >>> processes. > >>> > >>> With VIPT caches, the tag is the same for issues #2 and #3. The only > >>> difference may be in a few top bits of the index. In the above case, > >>> it's bit 12 of the VA which may differ. 
This gives you two page colours > >>> (with 64KB 4-way associative cache you have 2 bits for the colour > >>> resulting in 4 colours). > >>> > >> Thanks for the explanation, I need to read your e-mail in detail to > >> understand it fully. It seemed to me that having the same index was > >> enough to solve issues #2 and #3, and that it was possible by using > >> cache coulouring, but as I understand, the fact that a cache can have > >> multiple ways means that the same index can index several cache lines. > > > > Even if you have a 1-way associative cache (some processors allow the > > disabling of the other 3 ways if you want to try), the tag stored with > > the cache line is different between different VAs on a VIVT cache. > > > > So with two different VAs mapping the same PA, if a VA0 access allocates > > the cache line and VA1 would find the same cache line via the index > > calculation, it would get a cache miss because the tags for VA0 and VA1 > > do not match. > > But if we assume that it evicts the contents of VA0 and allocates the > cache for VA1 when VA1 is accessed, the system would just work. That's correct, for this particular case it should work (though I think fully associative caches are not that common). But what about same VA pointing to different PAs in separate processes (issue #4)? -- Catalin ^ permalink raw reply [flat|nested] 16+ messages in thread
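The issue raised at the end (same VA, different PAs in separate processes) is the case colouring cannot fix on VIVT, because the cache has no idea the underlying page changed. A self-contained sketch, with hypothetical direct-mapped geometry (8KB way, 32B lines) and data tracked per line so the stale hit is visible:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy VIVT cache with data, to show the same-VA/different-PA case:
 * after a context switch without a flush, the new process's access
 * to the same VA still matches the old virtual tag and returns the
 * previous process's data. */
#define NSET 256u

struct vline { bool valid; uint32_t tag; uint32_t data; };
static struct vline vc[NSET];

static struct vline *slot(uint32_t va) { return &vc[(va >> 5) & (NSET - 1)]; }

/* Line fill, as if read from whatever page the current tables map. */
static void fill(uint32_t va, uint32_t data)
{
    struct vline *l = slot(va);
    l->valid = true;
    l->tag = va >> 13;
    l->data = data;
}

/* Returns the cached line on a virtual-tag match, NULL on a miss. */
static struct vline *lookup(uint32_t va)
{
    struct vline *l = slot(va);
    return (l->valid && l->tag == va >> 13) ? l : NULL;
}

/* What the kernel does on a VIVT context switch. */
static void flush_on_switch(void)
{
    for (unsigned int i = 0; i < NSET; i++)
        vc[i].valid = false;
}
```

Without `flush_on_switch`, a lookup of the same VA after a context switch still hits on the old line even though the page tables now point elsewhere; this is why VIVT Linux flushes on every switch, as the opening message noted.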
* ARM caches variants. 2010-03-23 14:46 ` Catalin Marinas @ 2010-03-23 14:47 ` Catalin Marinas 2010-03-23 14:49 ` Gilles Chanteperdrix 2010-03-23 23:39 ` Jamie Lokier 2 siblings, 0 replies; 16+ messages in thread From: Catalin Marinas @ 2010-03-23 14:47 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2010-03-23 at 14:46 +0000, Catalin Marinas wrote: > On Tue, 2010-03-23 at 14:39 +0000, Gilles Chanteperdrix wrote: > > Catalin Marinas wrote: > > > On Tue, 2010-03-23 at 13:59 +0000, Gilles Chanteperdrix wrote: > > >> Catalin Marinas wrote: > > >>> On Tue, 2010-03-23 at 13:15 +0000, Gilles Chanteperdrix wrote: > > >>>> Catalin Marinas wrote: > > >>>>> On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: > > >>>>>> Now, the stupid question: why not using the cache colouring technique > > >>>>>> used for VIPT caches to solve issue #3 with VIVT caches? > > >>>>> Because with aliasing VIPT it is guaranteed that if a virtual address > > >>>>> has the same offset in a 16KB block (i.e. the same colour - there are > > >>>>> only 4 colours given by bits 13 and 12 of the virtual address), you get > > >>>>> the same cache line allocated for a given physical address. The tag of a > > >>>>> cache line is given by bits 31..14 of the physical address. > > >>>>> > > >>>>> With VIVT, the cache tags are not aware of the physical address, hence > > >>>>> you can have 2^20 colours (bits 31..12 of the virtual address). You > > >>>>> would need to map a physical address at the same virtual address in all > > >>>>> applications sharing it (and you may end up with uClinux :)). > > >>>> Ok. I do not get it. Let us do it in slow motion: as I understand, the > > >>>> problem with issue #2 and #3 is not really about the tag, but about two > > >>>> different virtual addresses ending up using different cache lines, > > >>>> whatever the tag. 
By using cache colouring, can not we ensure that they > > >>>> end up in the same cache line and simply evict each other because they > > >>>> do not have the same tag? > > >>>> > > >>>> In other word, is not the cache line used by virtual address addr: > > >>>> (addr % cache size) / (cache line size) > > >>> With any cache line, you have an index and a tag for identifying it. The > > >>> cache may have multiple ways (e.g. 4-way associative) to speed up the > > >>> look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). > > >>> > > >>> If the cache line size is 32B (2^5), the index of a cache line is: > > >>> > > >>> addr & (2^13 - 1) >> 5 > > >>> > > >>> e.g. bits 12..5 from the VA are used for indexing the cache line. > > >>> > > >>> The tag is given by the rest of the top bits, in the above case bits > > >>> 31..13 of the VA (if VIVT cache) or PA (VIPT cache). > > >>> > > >>> The cache look-up for a VA goes something like this: > > >>> > > >>> 1. extracts the index. With a 4-way associative cache there are 4 > > >>> possible cache lines for this index > > >>> 2. extracts the tag (from either VA or PA, depending on the cache > > >>> type). For VIPT caches, it needs to do a TLB look-up as well to > > >>> find the physical address > > >>> 3. check the four cache lines identified by the index at step 1 > > >>> against their tag > > >>> 4. if the tag matches, you get a hit, otherwise a miss > > >>> > > >>> For your #2 and #3 issues, if two processes map the same PA using > > >>> different VAs, data can end up pretty much anywhere in a VIVT cache. If > > >>> you calculate the index and tag (used to identify a cache line) for two > > >>> different VAs, the only common part are bits 11..5 of the index (since > > >>> they are inside a page). If you want to have the same index and tag for > > >>> the two different VAs, you end up with having to use the same VA in both > > >>> processes. 
> > >>> > > >>> With VIPT caches, the tag is the same for issues #2 and #3. The only > > >>> difference may be in a few top bits of the index. In the above case, > > >>> it's bit 12 of the VA which may differ. This gives you two page colours > > >>> (with 64KB 4-way associative cache you have 2 bits for the colour > > >>> resulting in 4 colours). > > >>> > > >> Thanks for the explanation, I need to read your e-mail in detail to > > >> understand it fully. It seemed to me that having the same index was > > >> enough to solve issues #2 and #3, and that it was possible by using > > >> cache coulouring, but as I understand, the fact that a cache can have > > >> multiple ways means that the same index can index several cache lines. > > > > > > Even if you have a 1-way associative cache (some processors allow the > > > disabling of the other 3 ways if you want to try), the tag stored with > > > the cache line is different between different VAs on a VIVT cache. > > > > > > So with two different VAs mapping the same PA, if a VA0 access allocates > > > the cache line and VA1 would find the same cache line via the index > > > calculation, it would get a cache miss because the tags for VA0 and VA1 > > > do not match. > > > > But if we assume that it evicts the contents of VA0 and allocates the > > cache for VA1 when VA1 is accessed, the system would just work. > > That's correct, for this particular case it should work (though I think > fully associative caches are not that common). > > But what about same VA pointing to different PAs in separate processes > (issue #4)? I meant issue #1. -- Catalin ^ permalink raw reply [flat|nested] 16+ messages in thread
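The index/tag split Catalin walks through above (32KB 4-way cache, 32B lines, so 8KB per way: index in VA bits 12..5, tag in bits 31..13) can be sketched numerically. This is a toy model, not real hardware behaviour; the two example addresses are made up, chosen so the in-way index matches but the VIVT tags differ — exactly the evict-and-refill case Gilles describes.

```python
# Sketch of the VIVT index/tag split from the message above:
# 32KB 4-way cache, 32B lines -> 8KB (2^13) per way,
# index = VA bits 12..5, tag = VA bits 31..13.
LINE_SHIFT = 5    # 32B cache lines
INDEX_BITS = 8    # log2(8KB / 32B) index bits per way

def index_and_tag(addr):
    index = (addr >> LINE_SHIFT) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (LINE_SHIFT + INDEX_BITS)
    return index, tag

# Two hypothetical VAs mapping the same PA: same page offset and same
# bit 12 (same colour), so the index matches -- but the VIVT tags come
# from the VA and differ, so one access misses and evicts the other.
va0, va1 = 0x40001040, 0x5000B040
print(index_and_tag(va0))  # same index as va1, different tag
print(index_and_tag(va1))
```

Only bits 11..5 of the index are guaranteed equal for arbitrary aliases (they sit inside the page); here bit 12 was deliberately matched as well.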
* ARM caches variants. 2010-03-23 14:46 ` Catalin Marinas 2010-03-23 14:47 ` Catalin Marinas @ 2010-03-23 14:49 ` Gilles Chanteperdrix 2010-03-23 23:39 ` Jamie Lokier 2 siblings, 0 replies; 16+ messages in thread From: Gilles Chanteperdrix @ 2010-03-23 14:49 UTC (permalink / raw) To: linux-arm-kernel Catalin Marinas wrote: > On Tue, 2010-03-23 at 14:39 +0000, Gilles Chanteperdrix wrote: >> Catalin Marinas wrote: >>> On Tue, 2010-03-23 at 13:59 +0000, Gilles Chanteperdrix wrote: >>>> Catalin Marinas wrote: >>>>> On Tue, 2010-03-23 at 13:15 +0000, Gilles Chanteperdrix wrote: >>>>>> Catalin Marinas wrote: >>>>>>> On Tue, 2010-03-23 at 12:39 +0000, Gilles Chanteperdrix wrote: >>>>>>>> Now, the stupid question: why not using the cache colouring technique >>>>>>>> used for VIPT caches to solve issue #3 with VIVT caches? >>>>>>> Because with aliasing VIPT it is guaranteed that if a virtual address >>>>>>> has the same offset in a 16KB block (i.e. the same colour - there are >>>>>>> only 4 colours given by bits 13 and 12 of the virtual address), you get >>>>>>> the same cache line allocated for a given physical address. The tag of a >>>>>>> cache line is given by bits 31..14 of the physical address. >>>>>>> >>>>>>> With VIVT, the cache tags are not aware of the physical address, hence >>>>>>> you can have 2^20 colours (bits 31..12 of the virtual address). You >>>>>>> would need to map a physical address at the same virtual address in all >>>>>>> applications sharing it (and you may end up with uClinux :)). >>>>>> Ok. I do not get it. Let us do it in slow motion: as I understand, the >>>>>> problem with issue #2 and #3 is not really about the tag, but about two >>>>>> different virtual addresses ending up using different cache lines, >>>>>> whatever the tag. By using cache colouring, can not we ensure that they >>>>>> end up in the same cache line and simply evict each other because they >>>>>> do not have the same tag? 
>>>>>> >>>>>> In other word, is not the cache line used by virtual address addr: >>>>>> (addr % cache size) / (cache line size) >>>>> With any cache line, you have an index and a tag for identifying it. The >>>>> cache may have multiple ways (e.g. 4-way associative) to speed up the >>>>> look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). >>>>> >>>>> If the cache line size is 32B (2^5), the index of a cache line is: >>>>> >>>>> addr & (2^13 - 1) >> 5 >>>>> >>>>> e.g. bits 12..5 from the VA are used for indexing the cache line. >>>>> >>>>> The tag is given by the rest of the top bits, in the above case bits >>>>> 31..13 of the VA (if VIVT cache) or PA (VIPT cache). >>>>> >>>>> The cache look-up for a VA goes something like this: >>>>> >>>>> 1. extracts the index. With a 4-way associative cache there are 4 >>>>> possible cache lines for this index >>>>> 2. extracts the tag (from either VA or PA, depending on the cache >>>>> type). For VIPT caches, it needs to do a TLB look-up as well to >>>>> find the physical address >>>>> 3. check the four cache lines identified by the index at step 1 >>>>> against their tag >>>>> 4. if the tag matches, you get a hit, otherwise a miss >>>>> >>>>> For your #2 and #3 issues, if two processes map the same PA using >>>>> different VAs, data can end up pretty much anywhere in a VIVT cache. If >>>>> you calculate the index and tag (used to identify a cache line) for two >>>>> different VAs, the only common part are bits 11..5 of the index (since >>>>> they are inside a page). If you want to have the same index and tag for >>>>> the two different VAs, you end up with having to use the same VA in both >>>>> processes. >>>>> >>>>> With VIPT caches, the tag is the same for issues #2 and #3. The only >>>>> difference may be in a few top bits of the index. In the above case, >>>>> it's bit 12 of the VA which may differ. 
This gives you two page colours >>>>> (with 64KB 4-way associative cache you have 2 bits for the colour >>>>> resulting in 4 colours). >>>>> >>>> Thanks for the explanation, I need to read your e-mail in detail to >>>> understand it fully. It seemed to me that having the same index was >>>> enough to solve issues #2 and #3, and that it was possible by using >>>> cache coulouring, but as I understand, the fact that a cache can have >>>> multiple ways means that the same index can index several cache lines. >>> Even if you have a 1-way associative cache (some processors allow the >>> disabling of the other 3 ways if you want to try), the tag stored with >>> the cache line is different between different VAs on a VIVT cache. >>> >>> So with two different VAs mapping the same PA, if a VA0 access allocates >>> the cache line and VA1 would find the same cache line via the index >>> calculation, it would get a cache miss because the tags for VA0 and VA1 >>> do not match. >> But if we assume that it evicts the contents of VA0 and allocates the >> cache for VA1 when VA1 is accessed, the system would just work. > > That's correct, for this particular case it should work (though I think > fully associative caches are not that common). > > But what about same VA pointing to different PAs in separate processes > (issue #4)? My original question was about solving issue #3 with cache coulouring while issue #1 and issue #2 are solved by the cache flush during context switch. But a corollary was to solve #2 and #3 with cache colouring while #1 is solved by FCSE. -- Gilles. ^ permalink raw reply [flat|nested] 16+ messages in thread
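The colouring Gilles refers to is what arch_get_unmapped_area does for shared mappings on aliasing VIPT: round candidate addresses so that every mapping of a given file offset lands on the same colour. A minimal model of that alignment (mirroring the kernel's COLOUR_ALIGN macro; SHMLBA is 16KB on ARM, covering the worst-case 4 colours — the hint addresses below are made up):

```python
# Model of the COLOUR_ALIGN trick used for shared mappings on ARM:
# round up to an SHMLBA boundary, then re-apply the colour implied by
# the file offset, so all sharers agree on VA bits 13..12.
PAGE_SHIFT = 12
SHMLBA = 4 * (1 << PAGE_SHIFT)  # 16KB on ARM

def colour_align(addr, pgoff):
    base = (addr + SHMLBA - 1) & ~(SHMLBA - 1)
    return base + ((pgoff << PAGE_SHIFT) & (SHMLBA - 1))

# Two processes with very different hint addresses still end up
# congruent modulo SHMLBA, i.e. on the same cache colour:
a = colour_align(0x40001234, 5)
b = colour_align(0x5F00A000, 5)
print(a % SHMLBA == b % SHMLBA)  # True
```

With VIVT this alignment alone is not enough, as discussed above: the tags are still virtual, so matching colours only guarantees mutual eviction, not sharing.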
* ARM caches variants.
  2010-03-23 14:46 ` Catalin Marinas
  2010-03-23 14:47 ` Catalin Marinas
  2010-03-23 14:49 ` Gilles Chanteperdrix
@ 2010-03-23 23:39 ` Jamie Lokier
  2010-03-24  9:33 ` Catalin Marinas
  2 siblings, 1 reply; 16+ messages in thread
From: Jamie Lokier @ 2010-03-23 23:39 UTC (permalink / raw)
To: linux-arm-kernel

Catalin Marinas wrote:
> > > Even if you have a 1-way associative cache (some processors allow the
> > > disabling of the other 3 ways if you want to try), the tag stored with
> > > the cache line is different between different VAs on a VIVT cache.
> > >
> > > So with two different VAs mapping the same PA, if a VA0 access allocates
> > > the cache line and VA1 would find the same cache line via the index
> > > calculation, it would get a cache miss because the tags for VA0 and VA1
> > > do not match.
> >
> > But if we assume that it evicts the contents of VA0 and allocates the
> > cache for VA1 when VA1 is accessed, the system would just work.
>
> That's correct, for this particular case it should work (though I think
> fully associative caches are not that common).

I think you might have meant 1-way caches, or that running them with 1
way is never done. But if you did have 1-way, then it might work. :-)

With 2-way or more, all bets are off because you don't know which way
will be evicted.

-- Jamie

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ARM caches variants. 2010-03-23 23:39 ` Jamie Lokier @ 2010-03-24 9:33 ` Catalin Marinas 0 siblings, 0 replies; 16+ messages in thread From: Catalin Marinas @ 2010-03-24 9:33 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2010-03-23 at 23:39 +0000, Jamie Lokier wrote: > Catalin Marinas wrote: > > > > Even if you have a 1-way associative cache (some processors allow the > > > > disabling of the other 3 ways if you want to try), the tag stored with > > > > the cache line is different between different VAs on a VIVT cache. > > > > > > > > So with two different VAs mapping the same PA, if a VA0 access allocates > > > > the cache line and VA1 would find the same cache line via the index > > > > calculation, it would get a cache miss because the tags for VA0 and VA1 > > > > do not match. > > > > > > But if we assume that it evicts the contents of VA0 and allocates the > > > cache for VA1 when VA1 is accessed, the system would just work. > > > > That's correct, for this particular case it should work (though I think > > fully associative caches are not that common). > > I think you might have meant 1-way caches Yes, you are right - 1-way caches (I think they are also called directly mapped caches, not sure). -- Catalin ^ permalink raw reply [flat|nested] 16+ messages in thread
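Jamie's point — that the alias trick only "just works" when there is exactly one line per index — can be shown with a toy set-associative cache. This is a simplified simulation (made-up addresses, a naive FIFO-ish eviction standing in for whatever policy real hardware uses):

```python
# Toy VIVT set-associative cache: in a 1-way (direct-mapped) cache the
# alias VA1 necessarily evicts VA0's line (one slot per index), but
# with 2+ ways both aliases can coexist at the same index, leaving a
# stale copy behind.
WAY_SIZE, LINE = 8192, 32

def index(va):
    return (va % WAY_SIZE) // LINE

class SetAssocCache:
    def __init__(self, ways):
        self.ways = ways
        self.sets = {}  # index -> list of (va_tag, data), newest last

    def access(self, va, data):
        s = self.sets.setdefault(index(va), [])
        if not any(tag == va // WAY_SIZE for tag, _ in s):
            if len(s) == self.ways:
                s.pop(0)  # evict oldest (real policy varies)
            s.append((va // WAY_SIZE, data))

va0, va1 = 0x40000000, 0x50000000  # same index, different VIVT tags

direct = SetAssocCache(ways=1)
direct.access(va0, "old"); direct.access(va1, "new")
print(len(direct.sets[index(va0)]))  # 1: VA0's line was evicted

two_way = SetAssocCache(ways=2)
two_way.access(va0, "old"); two_way.access(va1, "new")
print(len(two_way.sets[index(va0)]))  # 2: stale alias still cached
```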
* ARM caches variants. 2010-03-23 13:42 ` Catalin Marinas 2010-03-23 13:59 ` Gilles Chanteperdrix @ 2010-03-23 23:49 ` Jamie Lokier 2010-03-24 9:42 ` Catalin Marinas 1 sibling, 1 reply; 16+ messages in thread From: Jamie Lokier @ 2010-03-23 23:49 UTC (permalink / raw) To: linux-arm-kernel Catalin Marinas wrote: > > In other word, is not the cache line used by virtual address addr: > > (addr % cache size) / (cache line size) > > With any cache line, you have an index and a tag for identifying it. The > cache may have multiple ways (e.g. 4-way associative) to speed up the > look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). > > If the cache line size is 32B (2^5), the index of a cache line is: > > addr & (2^13 - 1) >> 5 > > e.g. bits 12..5 from the VA are used for indexing the cache line. > > The tag is given by the rest of the top bits, in the above case bits > 31..13 of the VA (if VIVT cache) or PA (VIPT cache). > > The cache look-up for a VA goes something like this: > > 1. extracts the index. With a 4-way associative cache there are 4 > possible cache lines for this index > 2. extracts the tag (from either VA or PA, depending on the cache > type). For VIPT caches, it needs to do a TLB look-up as well to > find the physical address > 3. check the four cache lines identified by the index at step 1 > against their tag > 4. if the tag matches, you get a hit, otherwise a miss > > For your #2 and #3 issues, if two processes map the same PA using > different VAs, data can end up pretty much anywhere in a VIVT cache. If > you calculate the index and tag (used to identify a cache line) for two > different VAs, the only common part are bits 11..5 of the index (since > they are inside a page). If you want to have the same index and tag for > the two different VAs, you end up with having to use the same VA in both > processes. > > With VIPT caches, the tag is the same for issues #2 and #3. The only > difference may be in a few top bits of the index. 
In the above case,
> it's bit 12 of the VA which may differ. This gives you two page colours
> (with 64KB 4-way associative cache you have 2 bits for the colour
> resulting in 4 colours).

That's a very helpful explanation, thank you.

Am I to understand that "VIPT aliasing" means there are some of those
bits and therefore >= 2 colours, and "VIPT non-aliasing" means the
cache size / ways is <= PAGE_SIZE, and therefore has effectively 1
colour? Or does "non-aliasing" mean something else?

I suspect some x86s have VIPT caches, especially AMD (I've seen timing
measurements which clearly show page colour effects), and I can only
imagine that aliasing is prevented when a cache line requests a fill
from the higher-level cache (L2): something very similar to SMP MESI
cache coherence gets involved to keep both lines consistent.

That would make a "VIPT non-aliasing" cache that has multiple colours.
Is that ever done on the ARM architecture?

Thanks again,
-- Jamie

^ permalink raw reply	[flat|nested] 16+ messages in thread
* ARM caches variants. 2010-03-23 23:49 ` Jamie Lokier @ 2010-03-24 9:42 ` Catalin Marinas 2010-03-26 5:45 ` Jamie Lokier 0 siblings, 1 reply; 16+ messages in thread From: Catalin Marinas @ 2010-03-24 9:42 UTC (permalink / raw) To: linux-arm-kernel On Tue, 2010-03-23 at 23:49 +0000, Jamie Lokier wrote: > Catalin Marinas wrote: > > > In other word, is not the cache line used by virtual address addr: > > > (addr % cache size) / (cache line size) > > > > With any cache line, you have an index and a tag for identifying it. The > > cache may have multiple ways (e.g. 4-way associative) to speed up the > > look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). > > > > If the cache line size is 32B (2^5), the index of a cache line is: > > > > addr & (2^13 - 1) >> 5 > > > > e.g. bits 12..5 from the VA are used for indexing the cache line. > > > > The tag is given by the rest of the top bits, in the above case bits > > 31..13 of the VA (if VIVT cache) or PA (VIPT cache). > > > > The cache look-up for a VA goes something like this: > > > > 1. extracts the index. With a 4-way associative cache there are 4 > > possible cache lines for this index > > 2. extracts the tag (from either VA or PA, depending on the cache > > type). For VIPT caches, it needs to do a TLB look-up as well to > > find the physical address > > 3. check the four cache lines identified by the index at step 1 > > against their tag > > 4. if the tag matches, you get a hit, otherwise a miss > > > > For your #2 and #3 issues, if two processes map the same PA using > > different VAs, data can end up pretty much anywhere in a VIVT cache. If > > you calculate the index and tag (used to identify a cache line) for two > > different VAs, the only common part are bits 11..5 of the index (since > > they are inside a page). If you want to have the same index and tag for > > the two different VAs, you end up with having to use the same VA in both > > processes. 
> > > > With VIPT caches, the tag is the same for issues #2 and #3. The only > > difference may be in a few top bits of the index. In the above case, > > it's bit 12 of the VA which may differ. This gives you two page colours > > (with 64KB 4-way associative cache you have 2 bits for the colour > > resulting in 4 colours). > > That's a very helpful explanation, thank you. > > Am I to understand that "VIPT aliasing" means there are some of those > bits and therefore >= 2 colours, and "VIPT non-aliasing" means the > cache size / ways is <= PAGE_SIZE, and therefore has effectively 1 colour? A method to get non-aliasing VIPT is to have the way size <= PAGE_SIZE. That's how ARM1136 with 16K caches works. But with bigger caches, adding more ways may get expensive in hardware. > I suspect some x86s have VIPT caches, especially AMD (I've seen timing > measurements which clearly show page colour effects), and I can only > imagine that aliasing is prevent by when a cache line requests to be > filled from higher level cache (L2), something very similar to SMP > MESI cache coherence gets involved to keep both lines consistent. > > That would make a "VIPT non-aliasing" cache that has multiple colours. > Is that ever done on the ARM architecture? ARMv7 has non-aliasing VIPT D-cache where the aliasing is handled by the hardware (maybe similar to MESI). I don't know the hardware implementation but my guess is that a cache look-up checks all the indices (4 in a 64K 4-way associative cache) and the tag may be extended to bit 12 (and may overlap with the index). Note that the I-cache on ARMv7 is an aliasing VIPT (when the way size > PAGE_SIZE). -- Catalin ^ permalink raw reply [flat|nested] 16+ messages in thread
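The criterion Catalin gives — a VIPT cache is non-aliasing when the way size (cache size / associativity) is no larger than a page — reduces to a one-line calculation. A small sketch using the cache geometries mentioned in the thread:

```python
# Aliasing criterion for VIPT caches, per the discussion above: the
# cache aliases when way_size = cache_size / ways exceeds PAGE_SIZE,
# and the number of page colours is way_size / PAGE_SIZE.
PAGE_SIZE = 4096

def vipt_colours(cache_size, ways):
    way_size = cache_size // ways
    return max(way_size // PAGE_SIZE, 1)

print(vipt_colours(16 * 1024, 4))  # ARM1136 16KB 4-way: 1 colour (non-aliasing)
print(vipt_colours(32 * 1024, 4))  # 8KB per way: 2 colours (bit 12)
print(vipt_colours(64 * 1024, 4))  # 16KB per way: 4 colours (bits 13..12)
```

This is why adding ways (rather than growing the way size) keeps a bigger cache non-aliasing, at a hardware cost.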
* ARM caches variants. 2010-03-24 9:42 ` Catalin Marinas @ 2010-03-26 5:45 ` Jamie Lokier 2010-03-26 13:23 ` Catalin Marinas 0 siblings, 1 reply; 16+ messages in thread From: Jamie Lokier @ 2010-03-26 5:45 UTC (permalink / raw) To: linux-arm-kernel Catalin Marinas wrote: > On Tue, 2010-03-23 at 23:49 +0000, Jamie Lokier wrote: > > Catalin Marinas wrote: > > > > In other word, is not the cache line used by virtual address addr: > > > > (addr % cache size) / (cache line size) > > > > > > With any cache line, you have an index and a tag for identifying it. The > > > cache may have multiple ways (e.g. 4-way associative) to speed up the > > > look-up. For a 32KB 4-way associative cache you have 8KB per way (2^13). > > > > > > If the cache line size is 32B (2^5), the index of a cache line is: > > > > > > addr & (2^13 - 1) >> 5 > > > > > > e.g. bits 12..5 from the VA are used for indexing the cache line. > > > > > > The tag is given by the rest of the top bits, in the above case bits > > > 31..13 of the VA (if VIVT cache) or PA (VIPT cache). > > > > > > The cache look-up for a VA goes something like this: > > > > > > 1. extracts the index. With a 4-way associative cache there are 4 > > > possible cache lines for this index > > > 2. extracts the tag (from either VA or PA, depending on the cache > > > type). For VIPT caches, it needs to do a TLB look-up as well to > > > find the physical address > > > 3. check the four cache lines identified by the index at step 1 > > > against their tag > > > 4. if the tag matches, you get a hit, otherwise a miss > > > > > > For your #2 and #3 issues, if two processes map the same PA using > > > different VAs, data can end up pretty much anywhere in a VIVT cache. If > > > you calculate the index and tag (used to identify a cache line) for two > > > different VAs, the only common part are bits 11..5 of the index (since > > > they are inside a page). 
If you want to have the same index and tag for > > > the two different VAs, you end up with having to use the same VA in both > > > processes. > > > > > > With VIPT caches, the tag is the same for issues #2 and #3. The only > > > difference may be in a few top bits of the index. In the above case, > > > it's bit 12 of the VA which may differ. This gives you two page colours > > > (with 64KB 4-way associative cache you have 2 bits for the colour > > > resulting in 4 colours). > > > > That's a very helpful explanation, thank you. > > > > Am I to understand that "VIPT aliasing" means there are some of those > > bits and therefore >= 2 colours, and "VIPT non-aliasing" means the > > cache size / ways is <= PAGE_SIZE, and therefore has effectively 1 colour? > > A method to get non-aliasing VIPT is to have the way size <= PAGE_SIZE. > That's how ARM1136 with 16K caches works. But with bigger caches, adding > more ways may get expensive in hardware. > > > I suspect some x86s have VIPT caches, especially AMD (I've seen timing > > measurements which clearly show page colour effects), and I can only > > imagine that aliasing is prevent by when a cache line requests to be > > filled from higher level cache (L2), something very similar to SMP > > MESI cache coherence gets involved to keep both lines consistent. > > > > That would make a "VIPT non-aliasing" cache that has multiple colours. > > Is that ever done on the ARM architecture? > > ARMv7 has non-aliasing VIPT D-cache where the aliasing is handled by the > hardware (maybe similar to MESI). I don't know the hardware > implementation but my guess is that a cache look-up checks all the > indices (4 in a 64K 4-way associative cache) and the tag may be extended > to bit 12 (and may overlap with the index). Loading a new cache line, or writing through it, must reach the other indices somehow to avoid aliases. Two kinds of behaviour come to mind: - Loading / writing through causes a clean+flush of all aliasing indices. 
- Loading permits multiple indices in S-state, as a MESI cache would. I'm not sure if the difference affects what you need to do for explicit cache flushes for DMA etc., or if it's just a timing/performance difference. > Note that the I-cache on ARMv7 is an aliasing VIPT (when the way size > > PAGE_SIZE). Aliases in a read-only cache (I-cache) don't matter, so I presume you mean it has multiple aliases against the D-cache? I think that would only affect what you have to do when flushing I-cache lines after writing data, and then only if flushing has to use the virtual address, not physical. Is that right? Thanks again, -- Jamie ^ permalink raw reply [flat|nested] 16+ messages in thread
* ARM caches variants. 2010-03-26 5:45 ` Jamie Lokier @ 2010-03-26 13:23 ` Catalin Marinas 0 siblings, 0 replies; 16+ messages in thread From: Catalin Marinas @ 2010-03-26 13:23 UTC (permalink / raw) To: linux-arm-kernel On Fri, 2010-03-26 at 05:45 +0000, Jamie Lokier wrote: > Catalin Marinas wrote: > > Note that the I-cache on ARMv7 is an aliasing VIPT (when the way size > > > PAGE_SIZE). > > Aliases in a read-only cache (I-cache) don't matter, so I presume you > mean it has multiple aliases against the D-cache? > > I think that would only affect what you have to do when flushing > I-cache lines after writing data, and then only if flushing has to use > the virtual address, not physical. Is that right? Flushing the L1 cache has to use the virtual address even on PIPT caches. In the Linux case, you can't use the kernel linear mapping to invalidate an I-cache line if the intention is to use the code in user space with a different virtual address. -- Catalin ^ permalink raw reply [flat|nested] 16+ messages in thread
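Catalin's point about the kernel linear mapping can be illustrated with a toy aliasing-VIPT I-cache: if the index includes VA bit 12, an invalidate issued through a kernel alias with a different bit 12 simply misses the user-side line. This is a deliberately simplified model (page-granular PA tags, made-up addresses), not how the hardware maintenance operations actually work internally:

```python
# Toy aliasing-VIPT I-cache: index from VA bits 12..5, tag from the
# PA. Invalidating through a kernel alias whose bit 12 differs from
# the user VA leaves the stale line in place -- hence the flush must
# use the VA the code will actually execute from.
def vindex(va):
    return (va >> 5) & 0xFF  # bits 12..5: bit 12 is the colour bit

lines = {}  # (index, pa_page) -> data

def fill(va, pa):
    lines[(vindex(va), pa >> 12)] = "stale"

def invalidate(va, pa):
    lines.pop((vindex(va), pa >> 12), None)

pa = 0x80001000
user_va, kernel_va = 0x00002000, 0xC0003000  # VA bit 12 differs

fill(user_va, pa)
invalidate(kernel_va, pa)        # wrong colour: the line survives
print((vindex(user_va), pa >> 12) in lines)  # True
invalidate(user_va, pa)          # right VA: line gone
print(lines)                     # {}
```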
end of thread, other threads:[~2010-03-26 13:23 UTC | newest]

Thread overview: 16+ messages (links below jump to the message on this page):
2010-03-23 12:39 ARM caches variants Gilles Chanteperdrix
2010-03-23 12:53 ` Catalin Marinas
2010-03-23 13:15 ` Gilles Chanteperdrix
2010-03-23 13:42 ` Catalin Marinas
2010-03-23 13:59 ` Gilles Chanteperdrix
2010-03-23 14:33 ` Catalin Marinas
2010-03-23 14:39 ` Gilles Chanteperdrix
2010-03-23 14:46 ` Catalin Marinas
2010-03-23 14:47 ` Catalin Marinas
2010-03-23 14:49 ` Gilles Chanteperdrix
2010-03-23 23:39 ` Jamie Lokier
2010-03-24  9:33 ` Catalin Marinas
2010-03-23 23:49 ` Jamie Lokier
2010-03-24  9:42 ` Catalin Marinas
2010-03-26  5:45 ` Jamie Lokier
2010-03-26 13:23 ` Catalin Marinas