* [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs
@ 2007-03-22 2:59 Benjamin Herrenschmidt
2007-03-22 5:44 ` Benjamin Herrenschmidt
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-22 2:59 UTC
To: Paul Mackerras, linuxppc-dev
On hash-table-based 32-bit powerpc, the hash management code runs with
a big spinlock held. It is thus important that it never takes a hash
fault itself. That code is generally safe (it does memory accesses in real mode
among other things) with the exception of the actual access to the code
itself. That is, the kernel text needs to be accessible without taking
a hash miss exception.
This is currently guaranteed by having a BAT register mapping part of the
linear mapping permanently, which includes the kernel text. But this is
not true if using the "nobats" kernel command line option (which can be
useful for debugging) and will not be true when using DEBUG_PAGEALLOC,
which is implemented in a subsequent patch.
This patch fixes this by pre-faulting the pages that hold the kernel text
into the hash table, and by making sure we never evict such an entry under
hash pressure.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/mm/hash_low_32.S | 22 ++++++++++++++++++++--
arch/powerpc/mm/mem.c | 3 ---
arch/powerpc/mm/mmu_decl.h | 3 +++
arch/powerpc/mm/pgtable_32.c | 11 +++++++----
4 files changed, 30 insertions(+), 9 deletions(-)
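For readers who do not read PowerPC assembly, the eviction rule described in the changelog can be sketched in C. This is only an illustration of the idea, not the kernel code: struct pte, pick_victim() and the index-based next_slot are hypothetical names invented here, and unlike the assembly (which loops until it finds a candidate) the sketch gives up after one pass and returns -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTES_PER_PTEG 8          /* a 32-bit hash PTEG holds 8 PTEs */

/* Simplified PTE: word1 carries the physical page number and flag bits. */
struct pte { uint32_t word0, word1; };

/* Round-robin eviction cursor (an index here, a byte offset in the asm). */
static unsigned next_slot;

/*
 * Pick a slot to evict from one PTEG, skipping every entry whose physical
 * page lies below etext_phys. Like the assembly, this treats everything
 * below the end of the kernel text as "kernel text".
 * Returns the slot index, or -1 if all eight entries map kernel text
 * (the assembly would spin forever in that case).
 */
static int pick_victim(const struct pte *pteg, uint32_t etext_phys)
{
    assert(pteg != NULL);
    for (int tries = 0; tries < PTES_PER_PTEG; tries++) {
        next_slot = (next_slot + 1) % PTES_PER_PTEG;
        /* Clear the low 12 flag bits, as clrrwi r0,r0,12 does. */
        uint32_t page = pteg[next_slot].word1 & ~(uint32_t)0xfff;
        if (page >= etext_phys)
            return (int)next_slot;
    }
    return -1;
}
```

With a PTEG whose slots 0 to 2 map pages below etext_phys, the cursor steps over them and the first victim is slot 3.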
Index: linux-cell/arch/powerpc/mm/hash_low_32.S
===================================================================
--- linux-cell.orig/arch/powerpc/mm/hash_low_32.S 2007-03-22 13:09:34.000000000 +1100
+++ linux-cell/arch/powerpc/mm/hash_low_32.S 2007-03-22 13:15:08.000000000 +1100
@@ -283,6 +283,7 @@ Hash_msk = (((1 << Hash_bits) - 1) * 64)
#define PTEG_SIZE 64
#define LG_PTEG_SIZE 6
#define LDPTEu lwzu
+#define LDPTE lwz
#define STPTE stw
#define CMPPTE cmpw
#define PTE_H 0x40
@@ -389,13 +390,30 @@ _GLOBAL(hash_page_patch_C)
* and we know there is a definite (although small) speed
* advantage to putting the PTE in the primary PTEG, we always
* put the PTE in the primary PTEG.
+ *
+ * In addition, we skip any slot that is mapping kernel text in
+ * order to avoid a deadlock when not using BAT mappings if
+ * trying to hash in the kernel hash code itself after it has
+ * already taken the hash table lock. This works in conjunction
+ * with pre-faulting of the kernel text.
+ *
+ * If the hash table bucket is full of kernel text entries, we'll
+ * lockup here but that shouldn't happen
*/
- addis r4,r7,next_slot@ha
+
+1: addis r4,r7,next_slot@ha /* get next evict slot */
lwz r6,next_slot@l(r4)
- addi r6,r6,PTE_SIZE
+ addi r6,r6,PTE_SIZE /* search for candidate */
andi. r6,r6,7*PTE_SIZE
stw r6,next_slot@l(r4)
add r4,r3,r6
+ LDPTE r0,PTE_SIZE/2(r4) /* get PTE second word */
+ clrrwi r0,r0,12
+ lis r6,etext@h
+ ori r6,r6,etext@l /* get etext */
+ tophys(r6,r6)
+ cmpl cr0,r0,r6 /* compare and try again */
+ blt 1b
#ifndef CONFIG_SMP
/* Store PTE in PTEG */
Index: linux-cell/arch/powerpc/mm/pgtable_32.c
===================================================================
--- linux-cell.orig/arch/powerpc/mm/pgtable_32.c 2007-03-22 13:10:49.000000000 +1100
+++ linux-cell/arch/powerpc/mm/pgtable_32.c 2007-03-22 13:27:56.000000000 +1100
@@ -282,16 +282,19 @@ int map_page(unsigned long va, phys_addr
void __init mapin_ram(void)
{
unsigned long v, p, s, f;
+ int ktext;
s = mmu_mapin_ram();
v = KERNELBASE + s;
p = PPC_MEMSTART + s;
for (; s < total_lowmem; s += PAGE_SIZE) {
- if ((char *) v >= _stext && (char *) v < etext)
- f = _PAGE_RAM_TEXT;
- else
- f = _PAGE_RAM;
+ ktext = ((char *) v >= _stext && (char *) v < etext);
+ f = ktext ?_PAGE_RAM_TEXT : _PAGE_RAM;
map_page(v, p, f);
+#ifdef CONFIG_PPC_STD_MMU_32
+ if (ktext)
+ hash_preload(&init_mm, v, 0, 0x300);
+#endif
v += PAGE_SIZE;
p += PAGE_SIZE;
}
Index: linux-cell/arch/powerpc/mm/mem.c
===================================================================
--- linux-cell.orig/arch/powerpc/mm/mem.c 2007-03-22 13:09:34.000000000 +1100
+++ linux-cell/arch/powerpc/mm/mem.c 2007-03-22 13:15:08.000000000 +1100
@@ -58,9 +58,6 @@ int init_bootmem_done;
int mem_init_done;
unsigned long memory_limit;
-extern void hash_preload(struct mm_struct *mm, unsigned long ea,
- unsigned long access, unsigned long trap);
-
int page_is_ram(unsigned long pfn)
{
unsigned long paddr = (pfn << PAGE_SHIFT);
Index: linux-cell/arch/powerpc/mm/mmu_decl.h
===================================================================
--- linux-cell.orig/arch/powerpc/mm/mmu_decl.h 2007-03-22 13:09:34.000000000 +1100
+++ linux-cell/arch/powerpc/mm/mmu_decl.h 2007-03-22 13:15:08.000000000 +1100
@@ -31,6 +31,9 @@ extern void settlbcam(int index, unsigne
unsigned int size, int flags, unsigned int pid);
extern void invalidate_tlbcam_entry(int index);
+extern void hash_preload(struct mm_struct *mm, unsigned long ea,
+ unsigned long access, unsigned long trap);
+
extern int __map_without_bats;
extern unsigned long ioremap_base;
extern unsigned int rtas_data, rtas_size;
* Re: [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs
2007-03-22 2:59 [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs Benjamin Herrenschmidt
@ 2007-03-22 5:44 ` Benjamin Herrenschmidt
2007-03-22 11:55 ` Segher Boessenkool
2007-04-18 8:58 ` Milton Miller
2 siblings, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-22 5:44 UTC
To: Paul Mackerras; +Cc: linuxppc-dev
> --- linux-cell.orig/arch/powerpc/mm/mmu_decl.h 2007-03-22 13:09:34.000000000 +1100
> +++ linux-cell/arch/powerpc/mm/mmu_decl.h 2007-03-22 13:15:08.000000000 +1100
> @@ -31,6 +31,9 @@ extern void settlbcam(int index, unsigne
> unsigned int size, int flags, unsigned int pid);
> extern void invalidate_tlbcam_entry(int index);
>
> +extern void hash_preload(struct mm_struct *mm, unsigned long ea,
> + unsigned long access, unsigned long trap);
> +
The above needs to be moved out of the #ifdef CONFIG_PPC32 it's in...
Ben.
* Re: [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs
2007-03-22 2:59 [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs Benjamin Herrenschmidt
2007-03-22 5:44 ` Benjamin Herrenschmidt
@ 2007-03-22 11:55 ` Segher Boessenkool
2007-03-22 12:00 ` Benjamin Herrenschmidt
2007-04-18 8:58 ` Milton Miller
2 siblings, 1 reply; 6+ messages in thread
From: Segher Boessenkool @ 2007-03-22 11:55 UTC
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Paul Mackerras
> + * If the hash table bucket is full of kernel text entries, we'll
> + * lockup here but that shouldn't happen
On 32-bit we have room for one HPTE per real page -- assuming
the kernel text is fully enclosed inside one power-of-two
size and aligned range of address space (i.e. always ;-) ),
a lockup here _cannot_ happen. It doesn't get tight until you
start using all of memory as kernel text.
So replace this comment with something like "the HTAB is
sized such that no PTEG can overflow from kernel text alone"?
It sounds a lot less hand-wavy than "but this shouldn't happen".
Segher
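The sizing argument above can be checked with a small counting sketch in C. It rests on assumptions that are not spelled out in the thread: the table is sized at one HPTE per 4K page (so pages/8 PTEGs), only the primary hash is modelled, the hash is simplified to (VSID xor page index) masked to the table size, and the VSID is constant over the text. max_text_per_pteg() is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT    12
#define PTES_PER_PTEG 8

/*
 * Return the largest number of kernel-text pages that hash into any one
 * primary PTEG, for a table sized at one HPTE per page of memory.
 * Simplified model: primary hash only, one constant VSID for all text.
 */
static unsigned max_text_per_pteg(uint32_t mem_bytes, uint32_t text_start,
                                  uint32_t text_bytes, uint32_t vsid)
{
    uint32_t n_pteg = (mem_bytes >> PAGE_SHIFT) / PTES_PER_PTEG;
    /* The masking below needs a non-zero power-of-two table size. */
    assert(n_pteg != 0 && (n_pteg & (n_pteg - 1)) == 0);

    unsigned *count = calloc(n_pteg, sizeof *count);
    assert(count != NULL);

    uint32_t first = text_start >> PAGE_SHIFT;
    uint32_t last = (text_start + text_bytes) >> PAGE_SHIFT;
    unsigned max = 0;
    for (uint32_t page = first; page < last; page++) {
        /* 16-bit page index within the segment, xor VSID, then mask. */
        uint32_t idx = (vsid ^ (page & 0xffffu)) & (n_pteg - 1);
        if (++count[idx] > max)
            max = count[idx];
    }
    free(count);
    return max;
}
```

Under this model, 64MB of memory gives 2048 PTEGs. An aligned 8MB of text puts at most one entry in each PTEG, and the count only reaches the full eight when the text covers all 64MB.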
* Re: [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs
2007-03-22 11:55 ` Segher Boessenkool
@ 2007-03-22 12:00 ` Benjamin Herrenschmidt
2007-03-22 12:01 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-22 12:00 UTC
To: Segher Boessenkool; +Cc: linuxppc-dev, Paul Mackerras
On Thu, 2007-03-22 at 12:55 +0100, Segher Boessenkool wrote:
> > + * If the hash table bucket is full of kernel text entries, we'll
> > + * lockup here but that shouldn't happen
>
> On 32-bit we have room for one HPTE per real page -- assuming
> the kernel text is fully enclosed inside one power-of-two
> size and aligned range of address space (i.e. always ;-) ),
> a lockup here _cannot_ happen. It doesn't get tight until you
> start using all of memory as kernel text.
I know, which is why "shouldn't" is a good enough choice of word :-)
> So replace this comment with something like "the HTAB is
> sized such that no PTEG can overflow from kernel text alone"?
> It sounds a lot less hand-wavy than "but this shouldn't happen".
Patch welcome :-)
Ben.
* Re: [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs
2007-03-22 12:00 ` Benjamin Herrenschmidt
@ 2007-03-22 12:01 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2007-03-22 12:01 UTC
To: Segher Boessenkool; +Cc: linuxppc-dev, Paul Mackerras
On Thu, 2007-03-22 at 23:00 +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2007-03-22 at 12:55 +0100, Segher Boessenkool wrote:
> > > + * If the hash table bucket is full of kernel text entries, we'll
> > > + * lockup here but that shouldn't happen
> >
> > On 32-bit we have room for one HPTE per real page -- assuming
> > the kernel text is fully enclosed inside one power-of-two
> > size and aligned range of address space (i.e. always ;-) ),
> > a lockup here _cannot_ happen. It doesn't get tight until you
> > start using all of memory as kernel text.
>
> I know, which is why "shouldn't" is a good enough choice of word :-)
>
> > So replace this comment with something like "the HTAB is
> > sized such that no PTEG can overflow from kernel text alone"?
> > It sounds a lot less hand-wavy than "but this shouldn't happen".
Note that there is still the potential for a second entry due to the
secondary group, but that wouldn't happen in practice :-)
Ben.
* Re: [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs
2007-03-22 2:59 [PATCH 2/4] powerpc: Fix 32 bits mm operations when not using BATs Benjamin Herrenschmidt
2007-03-22 5:44 ` Benjamin Herrenschmidt
2007-03-22 11:55 ` Segher Boessenkool
@ 2007-04-18 8:58 ` Milton Miller
2 siblings, 0 replies; 6+ messages in thread
From: Milton Miller @ 2007-04-18 8:58 UTC
To: Benjamin Herrenschmidt; +Cc: ppcdev, Paul Mackerras
In commit ee4f2ea48674b6c9d91bc854edc51a3e6a7168c4 Ben patched:
> diff --git a/arch/powerpc/mm/pgtable_32.c
> b/arch/powerpc/mm/pgtable_32.c
> index 95d3afe..f75f2fc 100644
> --- a/arch/powerpc/mm/pgtable_32.c
> +++ b/arch/powerpc/mm/pgtable_32.c
> @@ -282,16 +282,19 @@ int map_page(unsigned long va, phys_addr_t pa,
> int flags)
...
> + ktext = ((char *) v >= _stext && (char *) v < etext);
> + f = ktext ?_PAGE_RAM_TEXT : _PAGE_RAM;
> map_page(v, p, f);
> +#ifdef CONFIG_PPC_STD_MMU_32
> + if (ktext)
> + hash_preload(&init_mm, v, 0, 0x300);
> +#endif
We should preload text pages with trap 0x400 (ISI, an instruction
storage miss) rather than 0x300 (DSI, a data storage miss), to
avoid any confusion when comparing the trap against PAGE_EXEC.
Sorry for not catching this until it was applied.
milton
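A minimal C sketch of the suggestion above, assuming the trap argument is simply the exception vector number. TRAP_DSI, TRAP_ISI and preload_trap() are hypothetical names for illustration and do not come from the kernel source; the range test mirrors the ktext computation in the quoted patch.

```c
#include <assert.h>

#define TRAP_DSI 0x300UL   /* data storage interrupt vector */
#define TRAP_ISI 0x400UL   /* instruction storage interrupt vector */

/*
 * Choose the trap value to hand to a preload call: ISI for addresses
 * inside [stext, etext), DSI for everything else.
 */
static unsigned long preload_trap(unsigned long va, unsigned long stext,
                                  unsigned long etext)
{
    assert(stext <= etext);
    int ktext = (va >= stext && va < etext);
    return ktext ? TRAP_ISI : TRAP_DSI;
}
```

In the quoted hunk only text pages are preloaded, so the change amounts to passing 0x400 where the patch passes 0x300.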