* tlb magic
From: Mad Props @ 2005-06-13 14:06 UTC
To: linux-mips

Hi,

I'm trying to understand how to implement a TLB exception handler for a
MIPS32 ( 4KC ). As far as I understand it, it makes sense to locate the user
process page tables in kseg2 to save physical memory. The book I'm reading
states another advantage of using kseg2. I'm not quite sure what they mean,
stating that

"It provides an easy mechanism for remapping a new user page table when
changing context, without having to find enough virtual addresses in the OS
to map all the page tables at once. Instead, you just change the ASID value,
and the kseg2 pointer to the page table is now automatically remapped onto
the correct page table. It's nearly magic."

1. Is there only one kseg2 containing all page tables for 256 processes,
   i.e. only one ASID is used, or

2. Does each page table have its own address space ( using a different ASID
   for those addresses in kseg2 ) ?

3. Will I need another untranslated page table in kseg0/kseg1 to translate
   kseg2 addresses ?

4. What is this kseg2 pointer they are talking about ?

5. Are they talking about the ASID in EntryHi ?

6. Where is the magic ?

Would be smashing if anybody could help me out.

Kind regards,
Thomas
* Re: tlb magic
From: Dominic Sweetman @ 2005-06-13 20:59 UTC
To: Mad Props; +Cc: linux-mips

Thomas,

> I'm trying to understand how to implement a TLB exception handler for a
> MIPS32 ( 4KC ). As far as I understand it, it makes sense to locate the
> user process page tables in kseg2 to save physical memory. The book I'm
> reading states another advantage of using kseg2. I'm not quite sure what
> they mean, stating that
>
> "It provides an easy mechanism for remapping a new user page table when
> changing context, without having to find enough virtual addresses in the
> OS to map all the page tables at once. Instead, you just change the ASID
> value, and the kseg2 pointer to the page table is now automatically
> remapped onto the correct page table. It's nearly magic."

Sounds familiar. Are you reading "See MIPS Run"? If so, it has
pictures and further explanation. If not - well, no wonder you're
confused (run down to Amazon and see if they have any copies left!)

> 1. Is there only one kseg2 containing all page tables for 256 processes,
>    i.e. only one ASID is used, or
>
> 2. Does each page table have its own address space ( using a different
>    ASID for those addresses in kseg2 ) ?

MIPS TLBs can be loaded with "global" entries which map regardless of
ASID. Linux (which doesn't use kseg2 for page tables) only ever uses
global mappings to kseg2, which is therefore a shared space for all
kernel threads.

I think the early BSD ports, for which the kseg2 trick was invented,
did use per-process mappings in kseg2 for BSD's per-process data area
and the page table. The idea of using software to maintain the TLB
freaked out potential customers back in 1987, so it was important to
be able to show them a very short user-address TLB miss handler.

> 3. Will I need another untranslated page table in kseg0/kseg1 to
>    translate kseg2 addresses ?

Well, of course you don't *need* any particular format of page table,
it's all done by software. The only constraint here is that while
special tricks on the R3000 allow the user-mode-address TLB-miss
handler to take a nested exception (to fix up a missing translation
for the page table), those tricks won't work for the
kernel-mode-address TLB miss handler.

The BSD systems used kseg0 information to resolve kseg2
translation misses, or kept the crucial 2nd-level information in
places accessible through 'wired' (non-replaceable) TLB entries.

> 4. What is this kseg2 pointer they are talking about ?

It's probably a reference to the base of the page table, which is kept
in the high-order bits of the CP0 register "Context".

> 5. Are they talking about the ASID in EntryHi ?

Yes. The ASID in EntryHi does double duty: it is the "current" ASID
for the running process, and also the place where the ASID field of a
TLB entry gets lodged when the TLB is read/written.

> 6. Where is the magic ?

In the eye of the beholder.

Was that any help?

--
Dominic Sweetman
MIPS Technologies
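To make the answers about Context and EntryHi a little more concrete, here is
a minimal C sketch of how the two registers are put together. It assumes the
MIPS32 layout with 4 KB pages (Context: PTEBase in bits 31..23, BadVPN2 in
bits 22..4; EntryHi: ASID in bits 7..0); the R3000 lays these fields out
differently. The function names and the example base address are made up for
illustration and do not come from the thread or the book.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MIPS32 field layout (4 KB pages). */
#define CONTEXT_PTEBASE_MASK 0xff800000u /* bits 31..23, set by software   */
#define CONTEXT_BADVPN2_MASK 0x007ffff0u /* bits 22..4, set by hardware    */
#define ENTRYHI_ASID_MASK    0x000000ffu /* bits 7..0, the current ASID    */

/* Model of the value found in Context after a TLB miss on bad_vaddr.
 * pte_base is the page table base the OS wrote into PTEBase earlier. */
static uint32_t context_on_miss(uint32_t pte_base, uint32_t bad_vaddr)
{
    /* PTEBase only has bits 31..23, so the base must be 8 MB aligned. */
    assert((pte_base & ~CONTEXT_PTEBASE_MASK) == 0);

    uint32_t bad_vpn2 = bad_vaddr >> 13;   /* VA[31:13]: one even/odd page pair */
    return pte_base | ((bad_vpn2 << 4) & CONTEXT_BADVPN2_MASK);
}

/* Model of a context switch: only the ASID field of EntryHi changes. */
static uint32_t entryhi_set_asid(uint32_t entryhi, uint8_t asid)
{
    return (entryhi & ~ENTRYHI_ASID_MASK) | asid;
}
```

With a hypothetical page table base of 0xC0000000 (the start of kseg2), a miss
anywhere in the page pair at 0x00402000 yields the same Context value, which
the handler can use directly as the address of a 16-byte entry pair.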
* Re: tlb magic
From: madprops @ 2005-06-14 16:00 UTC
To: linux-mips

> Thomas,
>
> > I'm trying to understand how to implement a TLB exception handler for a
> > MIPS32 ( 4KC ). As far as I understand it, it makes sense to locate the
> > user process page tables in kseg2 to save physical memory. The book I'm
> > reading states another advantage of using kseg2. I'm not quite sure what
> > they mean, stating that
> >
> > "It provides an easy mechanism for remapping a new user page table when
> > changing context, without having to find enough virtual addresses in the
> > OS to map all the page tables at once. Instead, you just change the ASID
> > value, and the kseg2 pointer to the page table is now automatically
> > remapped onto the correct page table. It's nearly magic."
>
> Sounds familiar. Are you reading "See MIPS Run"? If so, it has
> pictures and further explanation. If not - well, no wonder you're
> confused (run down to Amazon and see if they have any copies left!)
>
> > 1. Is there only one kseg2 containing all page tables for 256 processes,
> >    i.e. only one ASID is used, or
> >
> > 2. Does each page table have its own address space ( using a different
> >    ASID for those addresses in kseg2 ) ?
>
> MIPS TLBs can be loaded with "global" entries which map regardless of
> ASID. Linux (which doesn't use kseg2 for page tables) only ever uses
> global mappings to kseg2, which is therefore a shared space for all
> kernel threads.
>
> I think the early BSD ports, for which the kseg2 trick was invented,
> did use per-process mappings in kseg2 for BSD's per-process data area
> and the page table. The idea of using software to maintain the TLB
> freaked out potential customers back in 1987, so it was important to
> be able to show them a very short user-address TLB miss handler.
>
> > 3. Will I need another untranslated page table in kseg0/kseg1 to
> >    translate kseg2 addresses ?
>
> Well, of course you don't *need* any particular format of page table,
> it's all done by software. The only constraint here is that while
> special tricks on the R3000 allow the user-mode-address TLB-miss
> handler to take a nested exception (to fix up a missing translation
> for the page table), those tricks won't work for the
> kernel-mode-address TLB miss handler.
>
> The BSD systems used kseg0 information to resolve kseg2
> translation misses, or kept the crucial 2nd-level information in
> places accessible through 'wired' (non-replaceable) TLB entries.
>
> > 4. What is this kseg2 pointer they are talking about ?
>
> It's probably a reference to the base of the page table, which is kept
> in the high-order bits of the CP0 register "Context".
>
> > 5. Are they talking about the ASID in EntryHi ?
>
> Yes. The ASID in EntryHi does double duty: it is the "current" ASID
> for the running process, and also the place where the ASID field of a
> TLB entry gets lodged when the TLB is read/written.
>
> > 6. Where is the magic ?
>
> In the eye of the beholder.
>
> Was that any help?
>
> --
> Dominic Sweetman
> MIPS Technologies

Hi,

yes, I'm reading "See MIPS Run". So thanks for the online support that comes
with it. Now, if I got it correctly, the exception routing described in
section 6.7 uses per-process mappings for kseg2, i.e. the first 2MB of
(each) kseg2, say, are used as the page table of the corresponding process,
and maybe another few KB for process-related stuff. Provided the page tables
are always at the same address ( e.g. KSEG2_BASE ), a change of ASID in
EntryHi would indeed make a change of the kseg2 pointer in Context
unnecessary ( it always points to KSEG2_BASE ). The mapping of kseg2 would
automatically change, as the global bit is set to zero.

Using the standard page table approach, I would now need an additional page
table for each process in order to map those 2+x MB in kseg2, which I could
put in kseg0/1 or in kseg2 with 'wired' TLB entries.

If that's the way to go - why was it only used in early BSD ports from around
1987 ? Are there any problems with it, or have other mechanisms turned out to
be better for some reason ?

Regards,
Thomas
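The per-process remapping described in this message can be modeled in a few
lines of C. This is only a simplified software model of the TLB match rule
(an entry hits if its VPN2 matches and either its global bit is set or its
ASID equals the current one), not kernel code. The names struct tlb_entry and
tlb_lookup, and the frame values used with them, are hypothetical and exist
only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified TLB entry: real entries map an even/odd page pair and keep
 * the G bit in EntryLo0/EntryLo1; one payload field is enough here. */
struct tlb_entry {
    uint32_t vpn2;   /* VA[31:13] of the mapped page pair        */
    uint8_t  asid;   /* address space this entry belongs to      */
    bool     global; /* if set, the ASID comparison is skipped   */
    uint32_t frame;  /* illustrative physical frame number       */
};

/* Return the matching entry for vaddr under current_asid, or NULL,
 * which stands for a TLB miss. */
static const struct tlb_entry *tlb_lookup(const struct tlb_entry *tlb,
                                          size_t n, uint32_t vaddr,
                                          uint8_t current_asid)
{
    assert(tlb != NULL);

    uint32_t vpn2 = vaddr >> 13;
    for (size_t i = 0; i < n; i++) {
        bool asid_ok = tlb[i].global || tlb[i].asid == current_asid;
        if (tlb[i].vpn2 == vpn2 && asid_ok)
            return &tlb[i];
    }
    return NULL;
}
```

If two non-global entries map the same kseg2 address for ASID 1 and ASID 2,
the same virtual address resolves to a different frame depending on the
current ASID, and an ASID without an entry simply misses. That is the sense
in which changing the ASID in EntryHi remaps the page table without touching
Context.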
* Re: tlb magic
From: Ralf Baechle @ 2005-06-14 17:18 UTC
To: madprops; +Cc: linux-mips

On Tue, Jun 14, 2005 at 06:00:26PM +0200, madprops@gmx.net wrote:

> yes, I'm reading "See MIPS Run". So thanks for the online support that
> comes with it. Now, if I got it correctly, the exception routing described
> in section 6.7 uses per-process mappings for kseg2, i.e. the first 2MB of
> (each) kseg2, say, are used as the page table of the corresponding
> process, and maybe another few KB for process-related stuff. Provided the
> page tables are always at the same address ( e.g. KSEG2_BASE ), a change
> of ASID in EntryHi would indeed make a change of the kseg2 pointer in
> Context unnecessary ( it always points to KSEG2_BASE ). The mapping of
> kseg2 would automatically change, as the global bit is set to zero.
>
> Using the standard page table approach, I would now need an additional
> page table for each process in order to map those 2+x MB in kseg2, which I
> could put in kseg0/1 or in kseg2 with 'wired' TLB entries.
>
> If that's the way to go - why was it only used in early BSD ports from
> around 1987 ? Are there any problems with it, or have other mechanisms
> turned out to be better for some reason ?

I don't know the details of how it was used in BSD. But this is how very
early Linux/MIPS kernels were doing it on R4000 class processors:

 - The entire 4MB of pagetables is mapped into KSEG2 at 0xe4000000.
 - Linux likes to think of pagetables as 2-level trees (simplifying things
   a little here).
 - So at 4kB pagesize and 4-byte entries for each page the root of the tree
   will end up at

     root = base + (base >> (12 - 2))

   where base is 0xe4000000, 12 is the log2 of the pagesize and 2 is the
   log2 of 4. So root computes to 0xe4390000.
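
The root address arithmetic above can be checked with a short C sketch. The
idea it illustrates is that in a linear page table at base, the entry for a
virtual address va sits at base + (va >> 12) * 4, so the entries that map
the page table region itself start at the entry for va = base. The function
names pte_addr and pagetable_root are hypothetical, chosen only for this
example.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the page table entry for va in a linear table at base.
 * page_shift is log2(page size), pte_shift is log2(entry size). */
static uint32_t pte_addr(uint32_t base, uint32_t va,
                         unsigned page_shift, unsigned pte_shift)
{
    assert(page_shift > pte_shift && page_shift < 32);
    return base + ((va >> page_shift) << pte_shift);
}

/* The "root" is where the table's own mapping starts: the entry for the
 * first page of the table. For a page-aligned base this equals
 * base + (base >> (page_shift - pte_shift)). */
static uint32_t pagetable_root(uint32_t base,
                               unsigned page_shift, unsigned pte_shift)
{
    return pte_addr(base, base, page_shift, pte_shift);
}
```

With base = 0xe4000000, page_shift = 12 and pte_shift = 2 this gives
0xe4000000 + 0x00390000 = 0xe4390000, matching the value in the message.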

Now let's see how we handle a pagefault in this scheme:

 - We take an exception and go to the reload handler at 0x80000000.
 - The CPU tries to help us [1] with the value in the context register,
   which with a little munging (see [1]) we use to index the 4MB of
   pagetables and load the right pagetable entry, then eret.

That's the fast path. Now for the slow path. We enter it if indexing the
mapped pagetable array at 0xe4000000 results in a TLB miss exception. But
we're already in a TLB exception handler running with the EXL flag in the
status register set:

 - We jump to 0x80000180.
 - We see it's a TLB exception, so we branch to the TLB exception handler.
 - The TLB exception handler figures out what kind of work it has to do.
   I only cover the TLB reload case here.
 - By now we know it must have been an access to the pagetable mapping
   that has failed.
 - We start all over by first indexing the 4kB of the root at 0xe4390000
   with the upper 10 bits of the virtual address and loading the entry
   found there into the TLB.

At this point we can guarantee that if we resume, execution will take an
exception again, but we'll only use the fast path part of the handlers.

So I guess by this point you're asking why this magic address for the TLB
root. As mentioned previously, Linux considers pagetables a two-level tree,
and the root of that tree (the first level) happens to be suitable as the
pagetable to map the 4MB of second level data. So on a context switch all
that's needed is swapping the content of this one wired entry holding the
root pointer and ASID and voila, we've magically changed the mappings for
the entire 4MB of pagetables.

I eventually removed that code because it was resulting in cache aliases,
and I felt that fixing them would eliminate the performance advantage of this
relatively complicated scheme. It certainly was too funky for an early stage
OS and we may reconsider.

  Ralf

[1] but doesn't really succeed, because for 4-byte pagetable entries the
    values in the context and xcontext registers are not formed the way
    we'd prefer them ...
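
The fast-path and slow-path address calculations described in this message
can be sketched in C as follows. The sketch assumes the R4000 Context layout
(BadVPN2 in bits 22..4, i.e. a stride of 16 bytes per page pair), 4 KB pages,
4-byte entries, and the 0xe4000000 and 0xe4390000 addresses from the message.
The shift by one is only one plausible reading of the "munging" mentioned in
footnote [1], not a reconstruction of the removed kernel code, and the
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CONTEXT_BADVPN2_MASK 0x007ffff0u /* BadVPN2 << 4, set by hardware */

/* Fast path: turn the Context value into the address of the even/odd
 * entry pair in the mapped 4MB table. Context counts 16 bytes per page
 * pair, but two 4-byte entries only take 8, hence the shift by one. */
static uint32_t pte_pair_addr(uint32_t table_base, uint32_t context)
{
    uint32_t offset16 = context & CONTEXT_BADVPN2_MASK;
    return table_base + (offset16 >> 1);
}

/* Slow path: index the 4kB root with the upper 10 bits of the original
 * faulting address to find the entry that maps the missing table page. */
static uint32_t root_entry_addr(uint32_t root, uint32_t vaddr)
{
    assert((root & 0xfffu) == 0); /* the root occupies one aligned 4kB page */
    return root + ((vaddr >> 22) << 2);
}
```

For a miss at user address 0x00402000, the fast path reads the entry pair at
0xe4001008. If that access itself misses, the slow path loads the root entry
at 0xe4390004, which is exactly the entry that maps the table page at
0xe4001000, so the retried fast path then succeeds.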
* Re: tlb magic
From: Dominic Sweetman @ 2005-06-25 5:51 UTC
To: madprops; +Cc: linux-mips

Long ago...

> yes, I'm reading "See MIPS Run". So thanks for the online support that
> comes with it. Now, if I got it correctly, the exception routing described
> in section 6.7 uses per-process mappings for kseg2, i.e. the first 2MB of
> (each) kseg2, say, are used as the page table of the corresponding
> process, and maybe another few KB for process-related stuff. Provided the
> page tables are always at the same address ( e.g. KSEG2_BASE ), a change
> of ASID in EntryHi would indeed make a change of the kseg2 pointer in
> Context unnecessary ( it always points to KSEG2_BASE ). The mapping of
> kseg2 would automatically change, as the global bit is set to zero.

Yes. I think I recall that the first BSD4.3 ports for MIPS had a
fixed-virtual address per-process structure which was extended to
include the L2 page table.

> Using the standard page table approach, I would now need an additional
> page table for each process in order to map those 2+x MB in kseg2, which I
> could put in kseg0/1 or in kseg2 with 'wired' TLB entries.
>
> If that's the way to go - why was it only used in early BSD ports from
> around 1987 ? Are there any problems with it, or have other mechanisms
> turned out to be better for some reason ?

It's rather a lot of assumptions to build into architecture-dependent
code, not very flexible, not very SMP-friendly, and in other ways not
as scalable as one would like.

Current Linux systems accept more computation in the TLB miss
handler in order to use largely portable data structures for keeping
page tables. You can always push at that trade-off...

--
Dominic Sweetman
MIPS Technologies
* Re: tlb magic
From: Ralf Baechle @ 2005-06-25 14:41 UTC
To: Dominic Sweetman; +Cc: madprops, linux-mips

On Sat, Jun 25, 2005 at 06:51:22AM +0100, Dominic Sweetman wrote:

> Current Linux systems accept more computation in the TLB miss
> handler in order to use largely portable data structures for keeping
> page tables. You can always push at that trade-off...

Further tuning is on the Linux agenda. Right now we've got a rather fancy
implementation of a slow (relatively speaking) but portable algorithm.

The most useful trick of all will be increasing the pagesize beyond the
small pagesize of 4k - significant performance benefits are expected
because the TLB reach will increase, and virtual aliases will also go away
on just about anything but the R4000SC, returning us to the promised land
of simple cache management :-)

  Ralf
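
To put rough numbers on the pagesize argument, here is a small C sketch. It
relies on two general facts: each MIPS TLB entry maps a pair of pages, and a
virtually indexed, physically tagged cache can only alias when one cache way
is larger than a page. The entry count and cache geometry used in the example
are illustrative assumptions, not figures from the thread, and the function
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Memory covered by a fully loaded TLB: every entry maps two pages. */
static uint64_t tlb_reach_bytes(unsigned entries, unsigned page_shift)
{
    assert(page_shift < 32);
    return (uint64_t)entries * 2u * ((uint64_t)1 << page_shift);
}

/* Virtual aliases are possible when the cache index uses address bits
 * above the page offset, i.e. when one way is larger than a page.
 * way_bytes is the cache size divided by its associativity. */
static bool can_alias(uint32_t way_bytes, unsigned page_shift)
{
    assert(page_shift < 32);
    return way_bytes > ((uint32_t)1 << page_shift);
}
```

For an assumed 16-entry TLB, moving from 4 KB to 16 KB pages raises the reach
from 128 KB to 512 KB, and an assumed 16 KB two-way cache (8 KB per way)
stops being able to alias once the page is at least 8 KB.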
* Re: tlb magic
From: Maciej W. Rozycki @ 2005-06-27 12:04 UTC
To: Ralf Baechle; +Cc: Dominic Sweetman, madprops, linux-mips

On Sat, 25 Jun 2005, Ralf Baechle wrote:

> The most useful trick of all will be increasing the pagesize beyond the
> small pagesize of 4k - significant performance benefits are expected
> because the TLB reach will increase, and virtual aliases will also go away
> on just about anything but the R4000SC, returning us to the promised land
> of simple cache management :-)

But that we have already done, haven't we? ;-)

  Maciej