* pte_update and 64-bit PTEs on PPC32? @ 2005-04-06 6:51 Kumar Gala 2005-04-06 6:53 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 18+ messages in thread From: Kumar Gala @ 2005-04-06 6:51 UTC (permalink / raw) To: Paul Mackerras, Matt Porter, Benjamin Herrenschmidt Cc: linuxppc-dev list, linux-ppc-embedded list Paul, I've tracked down a bug I've been having to the fact that pte_update assumes a pte is a unsigned long. I need to look into what the exact implications this has. I was wondering what the thoughts were with respect to how this is suppose to work properly on 440 with its 64-bit pte? I'm looking at a 64-bit pte for some Freescale book-e parts as we move to 36-bit physical address support. The problem I found was ptep_get_and_clear() would return back only a 32-bit value and thus we loose any information in the upper 32-bits. I found the call in sys_mprotect ... -> change_pte_range -> ptep_get_and_clear() Will provide some update on this tomorrow. - kumar ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 6:51 pte_update and 64-bit PTEs on PPC32? Kumar Gala @ 2005-04-06 6:53 ` Benjamin Herrenschmidt 2005-04-06 16:44 ` Kumar Gala 0 siblings, 1 reply; 18+ messages in thread From: Benjamin Herrenschmidt @ 2005-04-06 6:53 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Wed, 2005-04-06 at 01:51 -0500, Kumar Gala wrote: > Paul, > > I've tracked down a bug I've been having to the fact that pte_update > assumes a pte is a unsigned long. I need to look into what the exact > implications this has. I was wondering what the thoughts were with > respect to how this is suppose to work properly on 440 with its 64-bit > pte? I'm looking at a 64-bit pte for some Freescale book-e parts as we > move to 36-bit physical address support. > > The problem I found was ptep_get_and_clear() would return back only a > 32-bit value and thus we loose any information in the upper 32-bits. I > found the call in sys_mprotect ... -> change_pte_range -> > ptep_get_and_clear() > > Will provide some update on this tomorrow. It's quite important for the flags to all be together in a single 32 bits entity so that atomic operations can be done on them. The RPN should be able to extend beyond the initial 32 bits provided we are careful about the way we manipulate the PTEs. When setting a PTE, we should always first set the "other" part, then the PTE present bit last or a CPU would possibly get a stale PTE. The problem with that scheme is that I can see possible races on dual page faults trying to fill in the same PTE if we drop the page table lock (christoph lameter stuff) but it should work for us now. Ben. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 6:53 ` Benjamin Herrenschmidt @ 2005-04-06 16:44 ` Kumar Gala 2005-04-06 17:20 ` Chris Friesen ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: Kumar Gala @ 2005-04-06 16:44 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Apr 6, 2005, at 1:53 AM, Benjamin Herrenschmidt wrote: > On Wed, 2005-04-06 at 01:51 -0500, Kumar Gala wrote: > > Paul, > > > > I've tracked down a bug I've been having to the fact that pte_update > > assumes a pte is a unsigned long.=A0 I need to look into what the = exact > > implications this has.=A0 I was wondering what the thoughts were = with > > respect to how this is suppose to work properly on 440 with its=20 > 64-bit > > pte?=A0 I'm looking at a 64-bit pte for some Freescale book-e parts = as=20 > we > > move to 36-bit physical address support. > > > > The problem I found was ptep_get_and_clear() would return back only = a > > 32-bit value and thus we loose any information in the upper=20 > 32-bits.=A0 I > > found the call in sys_mprotect ... -> change_pte_range -> > > ptep_get_and_clear() > > > > Will provide some update on this tomorrow. > > It's quite important for the flags to all be together in a single 32 > bits entity so that atomic operations can be done on them. The RPN > should be able to extend beyond the initial 32 bits provided we are > careful about the way we manipulate the PTEs. When setting a PTE, we > should always first set the "other" part, then the PTE present bit=20 > last > or a CPU would possibly get a stale PTE. The problem with that scheme=20= > is > that I can see possible races on dual page faults trying to fill in = the > same PTE if we drop the page table lock (christoph lameter stuff) but=20= > it > should work for us now. Ben, I agree with you about having the flags in a single word so we can=20= lock them properly. In the short term it appears that the issue I'm=20 running into is explicit with ptep_get_and_clear(): static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned=20 long addr, pte_t *ptep) { return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0)); } It appears that we should be returning the pte that was passed in,=20 before its modified? (seems a little silly to me, why bother, the=20 caller could do this -- i've posted to lkml on the issue?). Anyways,=20 since pte_update only returns the lower 32-bits the wrong thing=20 happens. The following seems to be a better implementation of=20 ptep_get_and_clear() for ppc32 which resolves my issue: static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned=20 long addr, pte_t *ptep) { pte_t tmp =3D *ptep; pte_update(ptep, ~_PAGE_HASHPTE, 0); return tmp; } If we are ok with this I'll send a patch upstream for it. I'd like to=20= still discuss how to make this all proper long term. Currently,=20 ptep_get_and_clear was the only user of pte_update that used the return=20= value for anything but flags. One change would be for it to return=20 just the flags portion of the pte it was given. Another would be for=20 us to implement a proper 64-bit pte version of pte_update. - kumar ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 16:44 ` Kumar Gala @ 2005-04-06 17:20 ` Chris Friesen 2005-04-06 17:58 ` Kumar Gala 2005-04-06 22:22 ` Benjamin Herrenschmidt 2005-04-07 11:15 ` Benjamin Herrenschmidt 2 siblings, 1 reply; 18+ messages in thread From: Chris Friesen @ 2005-04-06 17:20 UTC (permalink / raw) To: Kumar Gala; +Cc: linux-ppc-embedded list, Paul Mackerras, linuxppc-dev list Kumar Gala wrote: > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned > long addr, > pte_t *ptep) > { > pte_t tmp = *ptep; > pte_update(ptep, ~_PAGE_HASHPTE, 0); > return tmp; > } Doesn't work. The update and storing of the old value must be atomic. The value of the pte can change between storing the tmp value and calling pte_update(). Chris ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 17:20 ` Chris Friesen @ 2005-04-06 17:58 ` Kumar Gala 2005-04-06 21:33 ` Kumar Gala 0 siblings, 1 reply; 18+ messages in thread From: Kumar Gala @ 2005-04-06 17:58 UTC (permalink / raw) To: Chris Friesen; +Cc: linux-ppc-embedded list, Paul Mackerras, linuxppc-dev list On Apr 6, 2005, at 12:20 PM, Chris Friesen wrote: > Kumar Gala wrote: > > > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, = unsigned > > long addr, > >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0= =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_t *ptep) > > { > >=A0=A0=A0=A0=A0=A0=A0=A0 pte_t tmp =3D *ptep; > >=A0=A0=A0=A0=A0=A0=A0=A0 pte_update(ptep, ~_PAGE_HASHPTE, 0); > >=A0=A0=A0=A0=A0=A0=A0=A0 return tmp; > > } > > Doesn't work.=A0 The update and storing of the old value must be = atomic. > The value of the pte can change between storing the tmp value and > calling pte_update(). Thanks, realized this after a bit more thought, and some discussion on=20= lkml :) Guess, I need to write a 64-bit pte version of pte_update, foo! - kumar ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 17:58 ` Kumar Gala @ 2005-04-06 21:33 ` Kumar Gala 2005-04-08 8:26 ` Gabriel Paubert 0 siblings, 1 reply; 18+ messages in thread From: Kumar Gala @ 2005-04-06 21:33 UTC (permalink / raw) To: Paul Mackerras; +Cc: linuxppc-dev list, linux-ppc-embedded list Here is a version that works if CONFIG_PTE_64BIT is defined. If we like this, I can simplify the pte_update so we dont need the (unsigned long)(p+1) - 4) trick anymore. Let me know. - kumar #ifdef CONFIG_PTE_64BIT static inline unsigned long long pte_update(pte_t *p, unsigned long clr, unsigned long set) { unsigned long long old; unsigned long tmp; __asm__ __volatile__("\ 1: lwarx %L0,0,%4\n\ lwzx %0,0,%3\n\ andc %1,%L0,%5\n\ or %1,%1,%6\n\ stwcx. %1,0,%4\n\ bne- 1b" : "=&r" (old), "=&r" (tmp), "=m" (*p) : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), "m" (*p) : "cc" ); return old; } #else ... code that exists today ... #endif ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 21:33 ` Kumar Gala @ 2005-04-08 8:26 ` Gabriel Paubert 2005-04-08 14:08 ` Kumar Gala 0 siblings, 1 reply; 18+ messages in thread From: Gabriel Paubert @ 2005-04-08 8:26 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote: > Here is a version that works if CONFIG_PTE_64BIT is defined. If we > like this, I can simplify the pte_update so we dont need the (unsigned > long)(p+1) - 4) trick anymore. Let me know. > > - kumar > > #ifdef CONFIG_PTE_64BIT > static inline unsigned long long pte_update(pte_t *p, unsigned long clr, > unsigned long set) > { > unsigned long long old; > unsigned long tmp; > > __asm__ __volatile__("\ > 1: lwarx %L0,0,%4\n\ > lwzx %0,0,%3\n\ > andc %1,%L0,%5\n\ > or %1,%1,%6\n\ > stwcx. %1,0,%4\n\ > bne- 1b" > : "=&r" (old), "=&r" (tmp), "=m" (*p) > : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), > "m" (*p) Are you sure of your pointer arithmetic? I believe that you'd rather want to use (unsigned char)(p)+4. Or even better: :"r" (p), "b" (4), "r" (clr), "r" (set) and change the first line to: lwarx %L0,%4,%3. Even more devious, you don't need the %4 parameter: li %L0,4 lwarx %L0,%L0,%3 since %L0 cannot be r0. This saves one register. > : "cc" ); On PPC, I always prefer saying cr0 over cc. Maybe it's just me, but it's the canonical register name in the architecture. Gabriel ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-08 8:26 ` Gabriel Paubert @ 2005-04-08 14:08 ` Kumar Gala 2005-04-08 18:44 ` Gabriel Paubert 0 siblings, 1 reply; 18+ messages in thread From: Kumar Gala @ 2005-04-08 14:08 UTC (permalink / raw) To: Gabriel Paubert Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Apr 8, 2005, at 3:26 AM, Gabriel Paubert wrote: > On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote: > > Here is a version that works if CONFIG_PTE_64BIT is defined.=A0 If = we > > like this, I can simplify the pte_update so we dont need the=20 > (unsigned > > long)(p+1) - 4) trick anymore.=A0 Let me know. > > > > - kumar > > > > #ifdef CONFIG_PTE_64BIT > > static inline unsigned long long pte_update(pte_t *p, unsigned long=20= > clr, > >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0= =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long set) > > { > >=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long long old; > >=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long tmp; > > > >=A0=A0=A0=A0=A0=A0=A0=A0 __asm__ __volatile__("\ > > 1:=A0=A0=A0=A0=A0 lwarx=A0=A0 %L0,0,%4\n\ > >=A0=A0=A0=A0=A0=A0=A0=A0 lwzx=A0=A0=A0 %0,0,%3\n\ > >=A0=A0=A0=A0=A0=A0=A0=A0 andc=A0=A0=A0 %1,%L0,%5\n\ > >=A0=A0=A0=A0=A0=A0=A0=A0 or=A0=A0=A0=A0=A0 %1,%1,%6\n\ > >=A0=A0=A0=A0=A0=A0=A0=A0 stwcx.=A0 %1,0,%4\n\ > >=A0=A0=A0=A0=A0=A0=A0=A0 bne-=A0=A0=A0 1b" > >=A0=A0=A0=A0=A0=A0=A0=A0 : "=3D&r" (old), "=3D&r" (tmp), "=3Dm" (*p) > >=A0=A0=A0=A0=A0=A0=A0=A0 : "r" (p), "r" ((unsigned long)(p) + 4), "r" = (clr), "r"=20 > (set), > > "m" (*p) > > Are you sure of your pointer arithmetic? I believe that > you'd rather want to use (unsigned char)(p)+4. Or even better: Realize that I'm converting the pointer to an int, so its not exactly=20 normal pointer math. Was stick with the pre-existing stye. > > :"r" (p), "b" (4), "r" (clr), "r" (set) > > and change the first line to:=A0 lwarx %L0,%4,%3. > > Even more devious, you don't need the %4 parameter: > > =A0=A0=A0=A0=A0=A0=A0 li %L0,4 > =A0=A0=A0=A0=A0=A0=A0 lwarx %L0,%L0,%3 > > since %L0 cannot be r0. This saves one register. Actually the compiler effective does this for me. If you look at the=20 generated asm, the only additional instruction is an 'addi' and some=20 'mr' to handle getting things in the correct registers for the return. =20= Not really sure if there is much else to do to optimize this. > >=A0=A0=A0=A0=A0=A0=A0=A0 : "cc" ); > > On PPC, I always prefer saying cr0 over cc. Maybe it's just > me, but it's the canonical register name in the architecture. Was sticking with the style of what already existed, but I agree that=20 cr is more natural to read than cc. - kumar ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-08 14:08 ` Kumar Gala @ 2005-04-08 18:44 ` Gabriel Paubert 2005-04-08 19:01 ` Kumar Gala 2005-04-09 0:32 ` Paul Mackerras 0 siblings, 2 replies; 18+ messages in thread From: Gabriel Paubert @ 2005-04-08 18:44 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Fri, Apr 08, 2005 at 09:08:28AM -0500, Kumar Gala wrote: > > On Apr 8, 2005, at 3:26 AM, Gabriel Paubert wrote: > > >On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote: > > > Here is a version that works if CONFIG_PTE_64BIT is defined. If we > >> like this, I can simplify the pte_update so we dont need the > >(unsigned > >> long)(p+1) - 4) trick anymore. Let me know. > > > > >> - kumar > > > > >> #ifdef CONFIG_PTE_64BIT > >> static inline unsigned long long pte_update(pte_t *p, unsigned long > >clr, > > > unsigned long set) > > > { > > > unsigned long long old; > > > unsigned long tmp; > > > > >> __asm__ __volatile__("\ > > > 1: lwarx %L0,0,%4\n\ > > > lwzx %0,0,%3\n\ > > > andc %1,%L0,%5\n\ > >> or %1,%1,%6\n\ > > > stwcx. %1,0,%4\n\ > > > bne- 1b" > > > : "=&r" (old), "=&r" (tmp), "=m" (*p) > >> : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" > >(set), > >> "m" (*p) > > > >Are you sure of your pointer arithmetic? I believe that > > you'd rather want to use (unsigned char)(p)+4. Or even better: > > Realize that I'm converting the pointer to an int, so its not exactly > normal pointer math. Was stick with the pre-existing stye. Wow, my brain saw a "*" before the closing parenthesis. > > > > >:"r" (p), "b" (4), "r" (clr), "r" (set) > > > >and change the first line to: lwarx %L0,%4,%3. > > > >Even more devious, you don't need the %4 parameter: > > > > li %L0,4 > > lwarx %L0,%L0,%3 > > > >since %L0 cannot be r0. This saves one register. > > Actually the compiler effective does this for me. If you look at the > generated asm, the only additional instruction is an 'addi' and some > 'mr' to handle getting things in the correct registers for the return. > Not really sure if there is much else to do to optimize this. Now that I read it carefully, I realize that I was wrong. But there is still some room for optimization; the parameter that you don't need is %3: simply replace lwzx %0,0,%3 by lwz %0,-4(%4). But I'm not sure that OOO cannot play tricks on you, what guarantees that the lwz is done after lwarx? Regards, Gabriel ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-08 18:44 ` Gabriel Paubert @ 2005-04-08 19:01 ` Kumar Gala 2005-04-08 21:04 ` Gabriel Paubert 2005-04-09 0:32 ` Paul Mackerras 1 sibling, 1 reply; 18+ messages in thread From: Kumar Gala @ 2005-04-08 19:01 UTC (permalink / raw) To: Gabriel Paubert Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Apr 8, 2005, at 1:44 PM, Gabriel Paubert wrote: > On Fri, Apr 08, 2005 at 09:08:28AM -0500, Kumar Gala wrote: > > > > On Apr 8, 2005, at 3:26 AM, Gabriel Paubert wrote: > > > > >On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote: > > > > Here is a version that works if CONFIG_PTE_64BIT is defined.=A0=20= > If we > > >> like this, I can simplify the pte_update so we dont need the > > >(unsigned > > >> long)(p+1) - 4) trick anymore.=A0 Let me know. > > > > > > >> - kumar > > > > > > >> #ifdef CONFIG_PTE_64BIT > > >> static inline unsigned long long pte_update(pte_t *p, unsigned=20 > long > > >clr, > > > >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0= =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long set) > > > > { > > > >=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long long old; > > > >=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long tmp; > > > > > > >>=A0=A0=A0=A0=A0=A0=A0=A0 __asm__ __volatile__("\ > > > > 1:=A0=A0=A0=A0=A0 lwarx=A0=A0 %L0,0,%4\n\ > > > >=A0=A0=A0=A0=A0=A0=A0=A0 lwzx=A0=A0=A0 %0,0,%3\n\ > > > >=A0=A0=A0=A0=A0=A0=A0=A0 andc=A0=A0=A0 %1,%L0,%5\n\ > > >>=A0=A0=A0=A0=A0=A0=A0=A0 or=A0=A0=A0=A0=A0 %1,%1,%6\n\ > > > >=A0=A0=A0=A0=A0=A0=A0=A0 stwcx.=A0 %1,0,%4\n\ > > > >=A0=A0=A0=A0=A0=A0=A0=A0 bne-=A0=A0=A0 1b" > > > >=A0=A0=A0=A0=A0=A0=A0=A0 : "=3D&r" (old), "=3D&r" (tmp), "=3Dm" = (*p) > > >>=A0=A0=A0=A0=A0=A0=A0=A0 : "r" (p), "r" ((unsigned long)(p) + 4), = "r" (clr), "r" > > >(set), > > >> "m" (*p) > > > > > >Are you sure of your pointer arithmetic? I believe that > > > you'd rather want to use (unsigned char)(p)+4. Or even better: > > > > Realize that I'm converting the pointer to an int, so its not = exactly > > normal pointer math.=A0 Was stick with the pre-existing stye. > > Wow, my brain saw a "*" before the closing parenthesis. > > > > > > > >:"r" (p), "b" (4), "r" (clr), "r" (set) > > > > > >and change the first line to:=A0 lwarx %L0,%4,%3. > > > > > >Even more devious, you don't need the %4 parameter: > > > > > >=A0=A0=A0=A0=A0=A0=A0 li %L0,4 > > > =A0=A0=A0=A0=A0=A0=A0 lwarx %L0,%L0,%3 > > > > > >since %L0 cannot be r0. This saves one register. > > > > Actually the compiler effective does this for me.=A0 If you look at = the > > generated asm, the only additional instruction is an 'addi' and = some > > 'mr' to handle getting things in the correct registers for the=20 > return.=A0 > > Not really sure if there is much else to do to optimize this. > > Now that I read it carefully, I realize that I was wrong. But there > is still some room for optimization; the parameter that you don't > need is %3: simply replace lwzx %0,0,%3 by lwz %0,-4(%4). Doesn't help, realize that we are going to have "r3" with a pointer to=20= pte. There is no way w/o an add to get to the next word for the lwarx. > But I'm not sure that OOO cannot play tricks on you, what guarantees > that the lwz is done after lwarx? I'm assuming since its a single asm block, gcc is not allowed to=20 reorder it. - kumar ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-08 19:01 ` Kumar Gala @ 2005-04-08 21:04 ` Gabriel Paubert 2005-04-08 21:31 ` Dan Malek 2005-04-08 23:32 ` Kumar Gala 0 siblings, 2 replies; 18+ messages in thread From: Gabriel Paubert @ 2005-04-08 21:04 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Fri, Apr 08, 2005 at 02:01:13PM -0500, Kumar Gala wrote: > >Now that I read it carefully, I realize that I was wrong. But there > >is still some room for optimization; the parameter that you don't > >need is %3: simply replace lwzx %0,0,%3 by lwz %0,-4(%4). > > Doesn't help, realize that we are going to have "r3" with a pointer to > pte. There is no way w/o an add to get to the next word for the lwarx. I'd have to see the context. One less parameter to an asm block may also make the compiler life easier. > > >But I'm not sure that OOO cannot play tricks on you, what guarantees > > that the lwz is done after lwarx? > > I'm assuming since its a single asm block, gcc is not allowed to > reorder it. Not GCC, but the hardware. If loads can pass loads and lwarx has more internal housekeeping overhead (obviously) than lwz. Especially in the case of a processor with 2 LSU: - lwarx issued to LSU1 - lwz issued LSU2 in the same clock cycle I'm not sure at all that that you are guaranteed not to get potentially stale data from the lwz on SMP. Loads are weekly ordered in general wrt each other and lwarx is no exception AFAIR. The fact that the two words are guaranteed to be in the same cache line makes it extremely unlikely, but not impossible. Gabriel ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-08 21:04 ` Gabriel Paubert @ 2005-04-08 21:31 ` Dan Malek 2005-04-08 21:44 ` Gabriel Paubert 2005-04-08 23:32 ` Kumar Gala 1 sibling, 1 reply; 18+ messages in thread From: Dan Malek @ 2005-04-08 21:31 UTC (permalink / raw) To: Gabriel Paubert Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Apr 8, 2005, at 5:04 PM, Gabriel Paubert wrote: > ...... Loads are weekly ^^^^^^ > ordered in general wrt each other and lwarx is no exception > AFAIR. Maybe that's my performance problem :-) -- Dan ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-08 21:31 ` Dan Malek @ 2005-04-08 21:44 ` Gabriel Paubert 0 siblings, 0 replies; 18+ messages in thread From: Gabriel Paubert @ 2005-04-08 21:44 UTC (permalink / raw) To: Dan Malek; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Fri, Apr 08, 2005 at 05:31:43PM -0400, Dan Malek wrote: > > On Apr 8, 2005, at 5:04 PM, Gabriel Paubert wrote: > > >...... Loads are weekly > ^^^^^^ > >ordered in general wrt each other and lwarx is no exception > >AFAIR. > > Maybe that's my performance problem :-) Good catch ;-) I believe it's my funniest typo to date. Gabriel ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-08 21:04 ` Gabriel Paubert 2005-04-08 21:31 ` Dan Malek @ 2005-04-08 23:32 ` Kumar Gala 1 sibling, 0 replies; 18+ messages in thread From: Kumar Gala @ 2005-04-08 23:32 UTC (permalink / raw) To: Gabriel Paubert Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Apr 8, 2005, at 4:04 PM, Gabriel Paubert wrote: > On Fri, Apr 08, 2005 at 02:01:13PM -0500, Kumar Gala wrote: > > >Now that I read it carefully, I realize that I was wrong. But = there > > >is still some room for optimization; the parameter that you don't > > >need is %3: simply replace lwzx %0,0,%3 by lwz %0,-4(%4). > > > > Doesn't help, realize that we are going to have "r3" with a pointer=20= > to > > pte.=A0 There is no way w/o an add to get to the next word for the=20= > lwarx. > > I'd have to see the context. One less parameter to an asm block may > also make the compiler life easier. The only thing we could do is make the 4 a constant param and change=20 the lwarx to use it.. not sure if thats any better than what we are=20 doing. > > > > >But I'm not sure that OOO cannot play tricks on you, what = guarantees > > > that the lwz is done after lwarx? > > > > I'm assuming since its a single asm block, gcc is not allowed to > > reorder it. > > Not GCC, but the hardware. If loads can pass loads and lwarx has > more internal housekeeping overhead (obviously) than lwz. Especially > in the case of a processor with 2 LSU: > - lwarx issued to LSU1 > - lwz issued LSU2 in the same clock cycle > > I'm not sure at all that that you are guaranteed not to get > potentially stale data from the lwz on SMP. Loads are weekly > ordered in general wrt each other and lwarx is no exception > AFAIR. The fact that the two words are guaranteed to be in > the same cache line makes it extremely unlikely, but not > impossible. You are correct, I guess I really need an eieio in between the lwarx=20 and lwzx - kumar= ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-08 18:44 ` Gabriel Paubert 2005-04-08 19:01 ` Kumar Gala @ 2005-04-09 0:32 ` Paul Mackerras 1 sibling, 0 replies; 18+ messages in thread From: Paul Mackerras @ 2005-04-09 0:32 UTC (permalink / raw) To: Gabriel Paubert; +Cc: linuxppc-dev list, linux-ppc-embedded list Gabriel Paubert writes: > But I'm not sure that OOO cannot play tricks on you, what guarantees > that the lwz is done after lwarx? Nothing, but it doesn't matter, because we have the mm->page_table_lock, and anything that is changing the top 32 bits, or anything in the bottom 32 bits other than the _PAGE_HASHPTE bit, must also take the mm->page_table_lock. The low-level hash_page routine can change the _PAGE_HASHPTE bit without having that lock, which is why we need the atomic sequence. Paul. ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 16:44 ` Kumar Gala 2005-04-06 17:20 ` Chris Friesen @ 2005-04-06 22:22 ` Benjamin Herrenschmidt 2005-04-06 22:27 ` Kumar Gala 2005-04-07 11:15 ` Benjamin Herrenschmidt 2 siblings, 1 reply; 18+ messages in thread From: Benjamin Herrenschmidt @ 2005-04-06 22:22 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Wed, 2005-04-06 at 11:44 -0500, Kumar Gala wrote: > Ben, I agree with you about having the flags in a single word so we can > lock them properly. In the short term it appears that the issue I'm > running into is explicit with ptep_get_and_clear(): > > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned > long addr, > pte_t *ptep) > { > return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0)); > } > > It appears that we should be returning the pte that was passed in, > before its modified? (seems a little silly to me, why bother, the > caller could do this -- i've posted to lkml on the issue?). No, we should return the "old" PTE. > Anyways, > since pte_update only returns the lower 32-bits the wrong thing > happens. The following seems to be a better implementation of > ptep_get_and_clear() for ppc32 which resolves my issue: > > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned > long addr, > pte_t *ptep) > { > pte_t tmp = *ptep; > pte_update(ptep, ~_PAGE_HASHPTE, 0); > return tmp; > } Hrm... I would still do it differently. I would read the "rpn only" part non atomically and load/clear the other half atomically. Withotu that, you may miss a bit set between the load and the update (for example, _PAGE_HASHPTE may have been put in in between). > If we are ok with this I'll send a patch upstream for it. I'd like to > still discuss how to make this all proper long term. Currently, > ptep_get_and_clear was the only user of pte_update that used the return > value for anything but flags. One change would be for it to return > just the flags portion of the pte it was given. Another would be for > us to implement a proper 64-bit pte version of pte_update. > > - kumar > -- Benjamin Herrenschmidt <benh@kernel.crashing.org> ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 22:22 ` Benjamin Herrenschmidt @ 2005-04-06 22:27 ` Kumar Gala 0 siblings, 0 replies; 18+ messages in thread From: Kumar Gala @ 2005-04-06 22:27 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list On Apr 6, 2005, at 5:22 PM, Benjamin Herrenschmidt wrote: > On Wed, 2005-04-06 at 11:44 -0500, Kumar Gala wrote: > > > Ben, I agree with you about having the flags in a single word so we=20= > can > > lock them properly.=A0 In the short term it appears that the issue = I'm > > running into is explicit with ptep_get_and_clear(): > > > > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, = unsigned > > long addr, > >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0= =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_t *ptep) > > { > >=A0=A0=A0=A0=A0=A0=A0=A0=A0 return __pte(pte_update(ptep, = ~_PAGE_HASHPTE, 0)); > > } > > > > It appears that we should be returning the pte that was passed in, > > before its modified?=A0 (seems a little silly to me, why bother, = the > > caller could do this -- i've posted to lkml on the issue?). > > No, we should return the "old" PTE. > > >=A0=A0 Anyways, > > since pte_update only returns the lower 32-bits the wrong thing > > happens.=A0 The following seems to be a better implementation of > > ptep_get_and_clear() for ppc32 which resolves my issue: > > > > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, = unsigned > > long addr, > >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0= =A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_t *ptep) > > { > >=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_t tmp =3D *ptep; > >=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_update(ptep, ~_PAGE_HASHPTE, 0); > >=A0=A0=A0=A0=A0=A0=A0=A0=A0 return tmp; > > } > > Hrm... I would still do it differently. I would read the "rpn only"=20 > part > non atomically and load/clear the other half atomically. Withotu = that, > you may miss a bit set between the load and the update (for example, > _PAGE_HASHPTE may have been put in in between). Ben, I'll let you catch up with the rest of the thread, I realized most of=20 what you have here and created a new implementation of pte_update that=20= should be doing what you are asking for. - kumar ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: pte_update and 64-bit PTEs on PPC32? 2005-04-06 16:44 ` Kumar Gala 2005-04-06 17:20 ` Chris Friesen 2005-04-06 22:22 ` Benjamin Herrenschmidt @ 2005-04-07 11:15 ` Benjamin Herrenschmidt 2 siblings, 0 replies; 18+ messages in thread From: Benjamin Herrenschmidt @ 2005-04-07 11:15 UTC (permalink / raw) To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list > Ben, I agree with you about having the flags in a single word so we can > lock them properly. In the short term it appears that the issue I'm > running into is explicit with ptep_get_and_clear(): > > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned > long addr, > pte_t *ptep) > { > return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0)); > } > > It appears that we should be returning the pte that was passed in, > before its modified? (seems a little silly to me, why bother, the > caller could do this No, we should return the PTE that was there before the change atomically. It can't be done by the caller as if the caller was just reading the PTE, there would be a race where the PTE could still change between what you read and what you update. It has to be an atomic operation. > -- i've posted to lkml on the issue?). Anyways, > since pte_update only returns the lower 32-bits the wrong thing > happens. The following seems to be a better implementation of > ptep_get_and_clear() for ppc32 which resolves my issue: > > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned > long addr, > pte_t *ptep) > { > pte_t tmp = *ptep; > pte_update(ptep, ~_PAGE_HASHPTE, 0); > return tmp; > } > > If we are ok with this I'll send a patch upstream for it. I'd like to > still discuss how to make this all proper long term. Currently, > ptep_get_and_clear was the only user of pte_update that used the return > value for anything but flags. One change would be for it to return > just the flags portion of the pte it was given. Another would be for > us to implement a proper 64-bit pte version of pte_update. > > - kumar > > _______________________________________________ > Linuxppc-embedded mailing list > Linuxppc-embedded@ozlabs.org > https://ozlabs.org/mailman/listinfo/linuxppc-embedded -- Benjamin Herrenschmidt <benh@kernel.crashing.org> ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2005-04-09 0:32 UTC | newest] Thread overview: 18+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-04-06 6:51 pte_update and 64-bit PTEs on PPC32? Kumar Gala 2005-04-06 6:53 ` Benjamin Herrenschmidt 2005-04-06 16:44 ` Kumar Gala 2005-04-06 17:20 ` Chris Friesen 2005-04-06 17:58 ` Kumar Gala 2005-04-06 21:33 ` Kumar Gala 2005-04-08 8:26 ` Gabriel Paubert 2005-04-08 14:08 ` Kumar Gala 2005-04-08 18:44 ` Gabriel Paubert 2005-04-08 19:01 ` Kumar Gala 2005-04-08 21:04 ` Gabriel Paubert 2005-04-08 21:31 ` Dan Malek 2005-04-08 21:44 ` Gabriel Paubert 2005-04-08 23:32 ` Kumar Gala 2005-04-09 0:32 ` Paul Mackerras 2005-04-06 22:22 ` Benjamin Herrenschmidt 2005-04-06 22:27 ` Kumar Gala 2005-04-07 11:15 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).