* Question about Context register in TLB refilling
From: wilbur.chan @ 2010-10-17 15:51 UTC
To: Linux MIPS Mailing List; +Cc: chelly wilbur

Hi all,

I have some questions concerning the TLB refill mechanism on MIPS:

1) In Linux, especially in TLB refilling, is Context[PTEBase] used
to store the cpuid? (Refer to build_get_pgde32 in tlbex.c.)

2) In build_get_ptep, when generating a PTE offset, is
Context[BadVPN] replaceable by the BadVAddr register?

Thanks in advance
* Re: Question about Context register in TLB refilling
From: Kevin Cernekee @ 2010-10-17 17:50 UTC
To: wilbur.chan; +Cc: Linux MIPS Mailing List

On Sun, Oct 17, 2010 at 8:51 AM, wilbur.chan <wilbur512@gmail.com> wrote:
> 1) In Linux, especially in TLB refilling, is Context[PTEBase] used
> to store cpuid? (refer to build_get_pgde32 in tlbex.c)

On 32-bit systems, PTEBase stores a byte offset that can be added to
&pgd_current[0], i.e. smp_processor_id() * sizeof(unsigned long).

So the TLB refill handler can find the pgd for the current CPU using code
that looks something like this:

   0:   401b2000    mfc0    k1,c0_context
   4:   3c1a8054    lui     k0,0x8054
   8:   001bddc2    srl     k1,k1,0x17
   c:   035bd821    addu    k1,k0,k1
        ...
  14:   8f7b5008    lw      k1,20488(k1)

where pgd_current is at 0x8054_5008, and PTEBase is 0, 4, 8, 12, ...

See also: TLBMISS_HANDLER_SETUP().

For 64-bit systems with CONFIG_MIPS_PGD_C0_CONTEXT, it looks like a
direct pgd pointer is now being stored in Context, to speed up the TLB
handlers.
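As a rough illustration of the arithmetic in the listing above (not the kernel's actual code), the following C sketch models how a byte offset kept in Context[31:23] is recovered by the right shift of 0x17, and how the lui immediate and the lw displacement combine into the address of pgd_current. The helper names make_context, pgd_current_offset and pgd_current_slot are hypothetical, and uint32_t stands in for a 32-bit unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* PTEBase occupies Context[31:23] on 32-bit MIPS; 0x17 == 23. */
#define PTEBASE_SHIFT 23

/* Hypothetical helper: build a Context value whose PTEBase field holds
 * cpu * 4, as described in the message above. low_bits models whatever
 * the hardware wrote into Context[22:0] on the last TLB miss. */
static uint32_t make_context(uint32_t cpu, uint32_t low_bits)
{
    assert(cpu < 128);  /* 9-bit field with a 4-byte stride: offsets 0..508 */
    uint32_t offset = cpu * (uint32_t)sizeof(uint32_t);
    return (offset << PTEBASE_SHIFT) | (low_bits & 0x007fffffu);
}

/* Mirrors "srl k1,k1,0x17": the shift drops the low bits and leaves
 * the per-CPU byte offset. */
static uint32_t pgd_current_offset(uint32_t context)
{
    return context >> PTEBASE_SHIFT;
}

/* Mirrors "lui k0,0x8054" plus the 20488 (0x5008) displacement of the
 * lw: the address of this CPU's slot in pgd_current[]. */
static uint32_t pgd_current_slot(uint32_t context)
{
    return (0x8054u << 16) + pgd_current_offset(context) + 0x5008u;
}
```

Under these assumptions, CPU 0 reads its pgd pointer from 0x8054_5008 and CPU 2 from 0x8054_5010, regardless of what the low bits of Context held.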
* Re: Question about Context register in TLB refilling
From: Maciej W. Rozycki @ 2010-10-17 19:33 UTC
To: Kevin Cernekee; +Cc: wilbur.chan, Linux MIPS Mailing List

On Sun, 17 Oct 2010, Kevin Cernekee wrote:

> > 1) In Linux, especially in TLB refilling, is Context[PTEBase] used
> > to store cpuid? (refer to build_get_pgde32 in tlbex.c)
>
> On 32-bit systems, PTEBase stores a byte offset that can be added to
> &pgd_current[0]. i.e. smp_processor_id() * sizeof(unsigned long)
>
> So the TLB refill handler can find pgd for the current CPU using code
> that looks something like this:
>
>    0:   401b2000    mfc0    k1,c0_context
>    4:   3c1a8054    lui     k0,0x8054
>    8:   001bddc2    srl     k1,k1,0x17
>    c:   035bd821    addu    k1,k0,k1
>         ...
>   14:   8f7b5008    lw      k1,20488(k1)
>
> where pgd_current is at 0x8054_5008, and PTEBase is 0, 4, 8, 12, ...

It has always made me wonder (though not enough to go and dig
through our code ;) ) why Linux is incapable of using the value presented
by the CPU in the CP0 Context register as is, or perhaps after a trivial
operation such as a left-shift by a constant number of bits (where the
size of the page entry slot assumed by hardware turned out too small).
There should be no need to add another constant as in the piece of code
you have quoted -- this constant should already have been preloaded to
this register when switching the context the last time. The design of the
TLB refill exception in the MIPS Architecture has been such as to allow
this register to be readily used as an address into the page table.
Hmm...

  Maciej
* Re: Question about Context register in TLB refilling
From: Kevin Cernekee @ 2010-10-17 20:52 UTC
To: Maciej W. Rozycki; +Cc: wilbur.chan, Linux MIPS Mailing List

On Sun, Oct 17, 2010 at 12:33 PM, Maciej W. Rozycki <macro@linux-mips.org> wrote:
>> where pgd_current is at 0x8054_5008, and PTEBase is 0, 4, 8, 12, ...
>
> It has always made me wonder (though not enough to go and dig
> through our code ;) ) why Linux is incapable of using the value presented
> by the CPU in the CP0 Context register as is, or perhaps after a trivial
> operation such as a left-shift by a constant number of bits (where the
> size of the page entry slot assumed by hardware turned out too small).
> There should be no need to add another constant as in the piece of code
> you have quoted -- this constant should already have been preloaded to
> this register when switching the context the last time. The design of the
> TLB refill exception in the MIPS Architecture has been such as to allow
> this register to be readily used as an address into the page table.

On plain old 32-bit MIPS:

The pgd entry for "va" is at address:

    (unsigned long)pgd + ((va >> 22) << 2)

i.e. each 4-byte entry in the pgd table represents 4MB of virtual
address space.

PTEBase only gives you 9 bits to work with. If you use it to store
pgd[31:23] directly, that means every pgd needs to be 8MB-aligned -
ouch.

You could potentially use PTEBase to store more of the significant
bits, e.g.

    pgd = 0x8000_0000 | (PTEBase << 12)

But that still places limits on where the pgd table can be stored, and
probably adds a decent number of extra arithmetic operations to each
handler.
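The two constraints in the message above can be written out as a small C sketch. This is only an illustration under stated assumptions: pgd_entry_addr and fits_in_ptebase are hypothetical names, addresses are modelled as uint32_t, and the page-alignment check reflects the assumption that a 1024-entry, 4-byte-per-entry pgd occupies one 4 KB page.

```c
#include <assert.h>
#include <stdint.h>

#define PGDIR_SHIFT   22   /* each pgd entry covers 4 MB of virtual space */
#define PTEBASE_SHIFT 23   /* PTEBase is Context[31:23], 9 bits wide      */

/* Address of the pgd entry for va: pgd + ((va >> 22) << 2). */
static uint32_t pgd_entry_addr(uint32_t pgd, uint32_t va)
{
    assert((pgd & 0xfffu) == 0);  /* assumption: pgd is 4 KB aligned */
    return pgd + ((va >> PGDIR_SHIFT) << 2);
}

/* Hypothetical check: a pgd base can be stored in PTEBase unmodified
 * only if its low 23 bits are zero, i.e. it is 8 MB aligned. */
static int fits_in_ptebase(uint32_t pgd)
{
    return (pgd & ((1u << PTEBASE_SHIFT) - 1u)) == 0;
}
```

For instance, a pgd at 0x8080_0000 would fit, while one placed next to pgd_current at 0x8054_5000 would not, which is the alignment cost being objected to.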
* Re: Question about Context register in TLB refilling
From: Maciej W. Rozycki @ 2010-10-17 21:56 UTC
To: Kevin Cernekee; +Cc: wilbur.chan, Linux MIPS Mailing List

On Sun, 17 Oct 2010, Kevin Cernekee wrote:

> On plain old 32-bit MIPS:
>
> The pgd entry for "va" is at address: (unsigned long)pgd + ((va >> 22) << 2)
>
> i.e. each 4-byte entry in the pgd table represents 4MB of virtual address space.
>
> PTEBase only gives you 9 bits to work with. If you use it to store
> pgd[31:23] directly, that means every pgd needs to be 8MB-aligned -
> ouch.

Good point! I believe the original idea behind the Context and XContext
registers was to put page tables somewhere within KSEG2/3 or XKSSEG which
would make this alignment restriction not a problem, but I realise the
overhead of placing page tables in paged memory may be higher than that of
our current arrangement. I wonder however if such performance evaluation
was actually ever made or whether it was the complexity of having page
tables paged alone that scared people off. ;)

  Maciej
* Re: Question about Context register in TLB refilling
From: Ralf Baechle @ 2010-10-18 0:00 UTC
To: Maciej W. Rozycki; +Cc: Kevin Cernekee, wilbur.chan, Linux MIPS Mailing List

On Sun, Oct 17, 2010 at 08:33:24PM +0100, Maciej W. Rozycki wrote:

> > > 1) In Linux, especially in TLB refilling, is Context[PTEBase] used
> > > to store cpuid? (refer to build_get_pgde32 in tlbex.c)
> >
> > On 32-bit systems, PTEBase stores a byte offset that can be added to
> > &pgd_current[0]. i.e. smp_processor_id() * sizeof(unsigned long)
> >
> > So the TLB refill handler can find pgd for the current CPU using code
> > that looks something like this:
> >
> >    0:   401b2000    mfc0    k1,c0_context
> >    4:   3c1a8054    lui     k0,0x8054
> >    8:   001bddc2    srl     k1,k1,0x17
> >    c:   035bd821    addu    k1,k0,k1
> >         ...
> >   14:   8f7b5008    lw      k1,20488(k1)
> >
> > where pgd_current is at 0x8054_5008, and PTEBase is 0, 4, 8, 12, ...
>
> It has always made me wonder (though not enough to go and dig
> through our code ;) ) why Linux is incapable of using the value presented
> by the CPU in the CP0 Context register as is, or perhaps after a trivial
> operation such as a left-shift by a constant number of bits (where the
> size of the page entry slot assumed by hardware turned out too small).
> There should be no need to add another constant as in the piece of code
> you have quoted -- this constant should already have been preloaded to
> this register when switching the context the last time. The design of the
> TLB refill exception in the MIPS Architecture has been such as to allow
> this register to be readily used as an address into the page table.
> Hmm...

The design of the R4000 c0_context / c0_xcontext register assumes 8-byte
PTEs and a flat page table array.

You can map the pagetables into virtual memory to get that, and in fact
very old Linux/MIPS versions did that, but that approach may result in
aliases on some processors, so I eventually dropped it. The
implementation requires nested TLB refill implementations and
(Linux/MIPS was still using a.out in those days) I implemented a new
relocation type to squeeze a cycle out of the slow path.

The aliasing problem is solvable and it may be worth revisiting that old
piece of code again now, 15 years later.

  Ralf
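To make the "8-byte PTEs and a flat page table array" assumption concrete, here is a minimal C model of how the hardware composes Context on a TLB miss. It assumes the 32-bit layout with 4 KB pages (PTEBase in bits 31:23, BadVPN2 = VA[31:13] in bits 22:4, zeros in bits 3:0); hw_context is a hypothetical name for the illustration, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the value the CPU presents in Context after a miss on
 * badvaddr, given the PTEBase bits software loaded earlier. */
static uint32_t hw_context(uint32_t ptebase_bits, uint32_t badvaddr)
{
    assert((ptebase_bits & 0x007fffffu) == 0);  /* only bits 31:23 are software-writable */
    uint32_t badvpn2 = badvaddr >> 13;          /* selects an even/odd pair of 4 KB pages */
    return ptebase_bits | (badvpn2 << 4);       /* 16 bytes per pair = two 8-byte PTEs */
}
```

With a table base of 0xC000_0000 in KSEG2, faults anywhere in the page pair 0x2000..0x3fff yield 0xC000_0010, and the highest address yields 0xC07F_FFF0, so one flat table spans 8 MB. Under this model the register carries only VA[31:13], so it cannot replace BadVAddr where the low address bits are needed.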
* Re: Question about Context register in TLB refilling
From: Gleb O. Raiko @ 2010-10-18 10:46 UTC
To: Ralf Baechle
Cc: Maciej W. Rozycki, Kevin Cernekee, wilbur.chan, Linux MIPS Mailing List

On 18.10.2010 4:00, Ralf Baechle wrote:
> The aliasing problem is solvable and it may be worth revisiting that old
> piece of code again now, 15 years later.

Before anybody starts preparing patches, I'd like to note that using
c0_context allows fewer than 128 processes (their mm contexts in fact,
but who cares) to be directly mapped on 32-bit CPUs. So, some kind
of caching needs to be implemented, and it will add overhead on every
mm switch. Sure, this overhead might be bounded for a real case
where there is a small number of processes, so they all fit in the
cache.

--- Beware, wild assumptions here ---
I'm afraid the cost of such caching will still be higher than
loading pgd_current even from main memory on TLB refill.
--- End of wild assumptions ---

Gleb.
* Re: Question about Context register in TLB refilling
From: Ralf Baechle @ 2010-10-18 12:48 UTC
To: Gleb O. Raiko
Cc: Maciej W. Rozycki, Kevin Cernekee, wilbur.chan, Linux MIPS Mailing List

On Mon, Oct 18, 2010 at 02:46:02PM +0400, Gleb O. Raiko wrote:

> On 18.10.2010 4:00, Ralf Baechle wrote:
> > The aliasing problem is solvable and it may be worth revisiting that old
> > piece of code again now, 15 years later.
>
> Before anybody starts preparing patches, I'd like to note that using
> c0_context allows fewer than 128 processes (their mm contexts in fact,
> but who cares) to be directly mapped on 32-bit CPUs. So, some kind
> of caching needs to be implemented, and it will add overhead on every
> mm switch. Sure, this overhead might be bounded for a real case
> where there is a small number of processes, so they all fit in the
> cache.
> --- Beware, wild assumptions here ---
> I'm afraid the cost of such caching will still be higher than
> loading pgd_current even from main memory on TLB refill.
> --- End of wild assumptions ---

64 contexts on R2000/R3000, 256 on everything else except the R6000 and
RM9000 series, 4096 contexts on RM9000, and that context caching is
already there. It's fairly lightweight except in the rare case where the
PID / ASID number overflows and a full TLB flush becomes necessary. An
mm context switch only needs to reload the one wired TLB entry that maps
the pagetables, so that's not too bad.

The ugly part is the nested TLB exceptions.

I dumped that very early on when I realized the cache alias issues my
implementation had, so the earliest usable kernel versions had the tree
walking reload handlers. That's why I don't have any benchmark results.

  Ralf
* Re: Question about Context register in TLB refilling
From: Gleb O. Raiko @ 2010-10-18 14:03 UTC
To: Ralf Baechle
Cc: Maciej W. Rozycki, Kevin Cernekee, wilbur.chan, Linux MIPS Mailing List

On 18.10.2010 16:48, Ralf Baechle wrote:
> On Mon, Oct 18, 2010 at 02:46:02PM +0400, Gleb O. Raiko wrote:
> 64 contexts on R2000/R3000, 256 on everything else except the R6000 and
> RM9000 series, 4096 contexts on RM9000, and that context caching is
> already there. It's fairly lightweight except in the rare case where the
> PID / ASID number overflows and a full TLB flush becomes necessary. An
> mm context switch only needs to reload the one wired TLB entry that maps
> the pagetables, so that's not too bad.

Ralf,

I counted from the opposite side. The size of KSEG2+KSEG3 is 1 GB, and a
flat page table must be 8 MB aligned to be stored in cp0 context, so we
end up with 128 page tables in theory (in practice we have to reserve
some space for other business too).

If we are going to use a "standard" approach where only the current page
table is mapped, we know the address at compile time and don't need cp0
context at all. We can even have as many page tables as there are ASIDs
on CPUs with multiple page sizes, but cp0 context is still out of play
anyway.

Gleb.
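The count of 128 in the message above is plain division, written out here as a short C sketch. The macro and function names are hypothetical; the 1 GB and 8 MB figures are the ones given in the message.

```c
#include <assert.h>
#include <stdint.h>

#define KSEG23_SIZE     (1u << 30)   /* KSEG2 + KSEG3: 1 GB of mapped kernel space */
#define FLAT_TABLE_SIZE (1u << 23)   /* one flat table: 8 MB, and 8 MB aligned     */

/* Upper bound on the number of aligned flat page tables that fit in a
 * region, before reserving any of it for other uses. */
static uint32_t max_flat_tables(uint32_t region, uint32_t table)
{
    assert(table != 0 && region % table == 0);
    return region / table;
}
```

Here max_flat_tables(KSEG23_SIZE, FLAT_TABLE_SIZE) gives the theoretical limit of 128; any space kept for other kernel mappings lowers it.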
* Re: Question about Context register in TLB refilling
From: Maciej W. Rozycki @ 2010-10-24 5:26 UTC
To: Ralf Baechle; +Cc: Kevin Cernekee, wilbur.chan, Linux MIPS Mailing List

On Mon, 18 Oct 2010, Ralf Baechle wrote:

> The design of the R4000 c0_context / c0_xcontext register assumes 8-byte
> PTEs and a flat page table array.

As I say, you can increase the size by left-shifting the register. That's
still cheaper than left-shifting and adding a 32-bit or, worse yet, a
64-bit base. Of course that implies a still higher alignment and a PTE
size that is a power of two (which, assuming at least a minimal level of
sanity, you want anyway).

A flat structure is quite limiting (read: memory-greedy) indeed, but it
looks to me that with clever masking and shifting you should be able to
get two-level page tables quite cheaply and easily too (on 32-bit).
Hmm...

> You can map the pagetables into virtual memory to get that, and in fact
> very old Linux/MIPS versions did that, but that approach may result in
> aliases on some processors, so I eventually dropped it. The
> implementation requires nested TLB refill implementations and
> (Linux/MIPS was still using a.out in those days) I implemented a new
> relocation type to squeeze a cycle out of the slow path.

Nested refills shouldn't be too much of a problem, but cache aliases
always ask for trouble, hmm... It may be worth investigating on
processors with no aliases first, if at all.

  Maciej