public inbox for linux-kernel@vger.kernel.org
* Re: PAT support
@ 2004-04-15 21:38 Albert Cahalan
  0 siblings, 0 replies; 20+ messages in thread
From: Albert Cahalan @ 2004-04-15 21:38 UTC (permalink / raw)
  To: linux-kernel mailing list; +Cc: ebiederm, ak, tripperda

Eric W. Biederman writes:
> Andi Kleen <ak@muc.de> writes:

>> Yes agreed. I already had vendors complaining about this.
>> But for this it will need some more work - the MTRRs need to be fully
>> converted to PAT and then disabled (because MTRRs have 
>> higher priority than PAT). Doing so is a lot more risky than 
>> what Terence's patch does currently though.  But longer term
>> we will need it.
>
> Ugh.  You are right.  The processors look at the two types and pick
> the one that caches the least.  So PAT can't enable caching :(

There's more to it than this. You need to use both
the MTRRs and PAT for best performance. I can't find
the explanation in my AMD manual, so maybe this is
an Intel-only thing. From (human) memory:

Use the PAT stuff as your primary cache-control
mechanism. Then, to the extent that you can, use
the MTRRs to double-mark some of the uncached or
uncacheable memory. This avoids some sort of
useless bus traffic or TLB goings-on.

Sorry I can't be clearer; check the Intel books.




* Re: PAT support
@ 2004-04-13  5:34 Manfred Spraul
  2004-04-13 14:02 ` Pavel Machek
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Manfred Spraul @ 2004-04-13  5:34 UTC (permalink / raw)
  To: Terence Ripperda; +Cc: linux-kernel

Hi Terence,

in your patch, you write
+/* Here is the PAT's default layout on ia32 cpus when we are done.
+ * PAT0: Write Back
+ * PAT1: Write Combine
+ * PAT2: Uncached
+ * PAT3: Uncacheable
+ * PAT4: Write Through
+ * PAT5: Write Protect
+ * PAT6: Uncached
+ * PAT7: Uncacheable

Is that layout possible?
There is an erratum in the B2 and C1 steppings of the Pentium 4 CPUs that
results in incorrect PAT numbers: the highest bit is ignored by the CPU
under some circumstances. There's a similar erratum (E27) that affects
all Pentium 3 CPUs: the highest bit is always ignored.
I think we need a fallback to 4 PAT entries.
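
To make the concern concrete, here is a minimal sketch of how a 4 KB page-table entry selects a PAT entry, and what happens if the highest index bit is dropped. The bit positions (PWT = bit 3, PCD = bit 4, PAT = bit 7) are the architectural ones for 4 KB pages; the function names and the "always ignored" model of the erratum are assumptions made for illustration, not code from the patch.

```c
/* x86 page-table entry bits that select a PAT entry (4 KB pages). */
#define PTE_PWT (1u << 3)
#define PTE_PCD (1u << 4)
#define PTE_PAT (1u << 7)

/* Layout as quoted from the patch comment above. */
static const char *const pat_layout[8] = {
    "Write Back", "Write Combine", "Uncached", "Uncacheable",
    "Write Through", "Write Protect", "Uncached", "Uncacheable",
};

/* The PAT index is the 3-bit number PAT:PCD:PWT. */
static unsigned pat_index(unsigned pte)
{
    return ((unsigned)!!(pte & PTE_PAT) << 2) |
           ((unsigned)!!(pte & PTE_PCD) << 1) |
            (unsigned)!!(pte & PTE_PWT);
}

/*
 * Hypothetical model of the erratum: the CPU ignores the highest
 * index bit, so entries 4-7 alias entries 0-3.
 */
static unsigned pat_index_high_bit_ignored(unsigned pte)
{
    return pat_index(pte) & 3u;
}
```

Under this model, a PTE that asks for PAT4 (Write Through) is treated as PAT0 (Write Back), and PAT5 (Write Protect) is treated as PAT1 (Write Combine). With a 4-entry fallback, every type the kernel relies on would have to live in PAT0 to PAT3.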

--
    Manfred


* PAT support
@ 2004-04-12 22:29 Terence Ripperda
  2004-04-13  8:36 ` Andy Whitcroft
  0 siblings, 1 reply; 20+ messages in thread
From: Terence Ripperda @ 2004-04-12 22:29 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: tripperda

[-- Attachment #1: Type: text/plain, Size: 2301 bytes --]

Hi all,

quite a while back, I sent out email about adding Page Attribute Table support to the kernel (http://www.kerneltraffic.org/kernel-traffic/kt20030616_219.html#3).

At the time, the concern was being able to mark remapped i/o pages Write Combined in the case that we ran out of MTRRs to do so. This was mainly for the AGP aperture and framebuffer. With PCI Express systems coming out, the need changes slightly. PCI Express does not have a centralized aperture like AGP that can be marked WC. Instead, individual system pages of memory need to be marked WC via PAT in the page tables. This significantly increases the need for PAT support under Linux to maintain high performance levels on PCI Express systems.

I thought the best approach would be to handle some of the original feedback in the code before I came back to ping lkml. I discussed things a little offline with Andi Kleen. He suggested I focus on the simpler i/o regions first, then come back to handling main memory once I had that done and had gotten feedback on it.

I've worked on a mechanism (cachemap) to track what type of caching a region of memory is currently mapped as. When a new region of memory is mapped, cachemap is queried to make sure the new region's caching type matches the old type (or is compatible with the old type). If the cachemap query succeeds, it's safe to map the new i/o region; otherwise it's not safe.

For the first pass, I focused on testing ioremap, so the cachemap queries are only made from ioremap. I also added code to have the mtrr code call the cachemap code (in this case, it's a report rather than a query, to indicate the mapping's already made). I've made a few test runs on systems here, and it seems to work fairly well.

This current patch includes the original PAT support and the new cachemap mechanism. Note that the cachemap mechanism does not actually change any caching attributes; it only keeps track of the attributes and tests regions. I think the end idea would be that drivers would use the normal ioremap/change_page_attr/remap_page_range mechanisms like they already do, and these mechanisms would in turn use cachemap to make sure there are no conflicts. I'm completely open to how any specific details should work, and to any changes that need to be made.
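
The query/report split described above could look roughly like the following. This is a minimal sketch and not the code in the attached patch: the names cachemap_query and cachemap_report, the fixed-size table, and the enum values are all hypothetical, and "compatible" is simplified here to "exactly the same type".

```c
#include <stddef.h>

/* Hypothetical caching types tracked per region. */
enum cache_type { CT_UC, CT_WC, CT_WT, CT_WP, CT_WB };

struct cachemap_entry {
    unsigned long start;    /* first byte of the region */
    unsigned long end;      /* one past the last byte */
    enum cache_type type;
};

#define CACHEMAP_MAX 32
static struct cachemap_entry cachemap[CACHEMAP_MAX];
static size_t cachemap_count;

/*
 * Query: returns 0 if [start, end) may be mapped as `type`, or -1 if it
 * overlaps a tracked region with a different type. Nothing is changed.
 */
static int cachemap_query(unsigned long start, unsigned long end,
                          enum cache_type type)
{
    for (size_t i = 0; i < cachemap_count; i++) {
        const struct cachemap_entry *e = &cachemap[i];

        if (start < e->end && e->start < end && e->type != type)
            return -1;
    }
    return 0;
}

/*
 * Report: records a mapping that is being (or has been) made. Returns 0
 * on success, -1 on an empty range, a conflict, or a full table.
 */
static int cachemap_report(unsigned long start, unsigned long end,
                           enum cache_type type)
{
    if (start >= end || cachemap_count == CACHEMAP_MAX)
        return -1;
    if (cachemap_query(start, end, type) != 0)
        return -1;
    cachemap[cachemap_count++] = (struct cachemap_entry){ start, end, type };
    return 0;
}
```

In this sketch an ioremap-style caller would run the query before touching any page tables and refuse the mapping on -1, while the mtrr code would call the report function for ranges it has already set up. Only bookkeeping happens here; no caching attribute is changed.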

Thanks,
Terence


[-- Attachment #2: cachemap-1.9-2.6.4.patch.bz2 --]
[-- Type: application/octet-stream, Size: 9590 bytes --]

