linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* pte_update and 64-bit PTEs on PPC32?
@ 2005-04-06  6:51 Kumar Gala
  2005-04-06  6:53 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Gala @ 2005-04-06  6:51 UTC (permalink / raw)
  To: Paul Mackerras, Matt Porter, Benjamin Herrenschmidt
  Cc: linuxppc-dev list, linux-ppc-embedded list

Paul,

I've tracked down a bug I've been having to the fact that pte_update 
assumes a pte is a unsigned long.  I need to look into what the exact 
implications this has.  I was wondering what the thoughts were with 
respect to how this is suppose to work properly on 440 with its 64-bit 
pte?  I'm looking at a 64-bit pte for some Freescale book-e parts as we 
move to 36-bit physical address support.

The problem I found was ptep_get_and_clear() would return back only a 
32-bit value and thus we loose any information in the upper 32-bits.  I 
found the call in sys_mprotect ... -> change_pte_range -> 
ptep_get_and_clear()

Will provide some update on this tomorrow.

- kumar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06  6:51 pte_update and 64-bit PTEs on PPC32? Kumar Gala
@ 2005-04-06  6:53 ` Benjamin Herrenschmidt
  2005-04-06 16:44   ` Kumar Gala
  0 siblings, 1 reply; 18+ messages in thread
From: Benjamin Herrenschmidt @ 2005-04-06  6:53 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list

On Wed, 2005-04-06 at 01:51 -0500, Kumar Gala wrote:
> Paul,
> 
> I've tracked down a bug I've been having to the fact that pte_update 
> assumes a pte is a unsigned long.  I need to look into what the exact 
> implications this has.  I was wondering what the thoughts were with 
> respect to how this is suppose to work properly on 440 with its 64-bit 
> pte?  I'm looking at a 64-bit pte for some Freescale book-e parts as we 
> move to 36-bit physical address support.
> 
> The problem I found was ptep_get_and_clear() would return back only a 
> 32-bit value and thus we loose any information in the upper 32-bits.  I 
> found the call in sys_mprotect ... -> change_pte_range -> 
> ptep_get_and_clear()
> 
> Will provide some update on this tomorrow.

It's quite important for the flags to all be together in a single 32
bits entity so that atomic operations can be done on them. The RPN
should be able to extend beyond the initial 32 bits provided we are
careful about the way we manipulate the PTEs. When setting a PTE, we
should always first set the "other" part, then the PTE present bit last
or a CPU would possibly get a stale PTE. The problem with that scheme is
that I can see possible races on dual page faults trying to fill in the
same PTE if we drop the page table lock (christoph lameter stuff) but it
should work for us now.

Ben.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06  6:53 ` Benjamin Herrenschmidt
@ 2005-04-06 16:44   ` Kumar Gala
  2005-04-06 17:20     ` Chris Friesen
                       ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Kumar Gala @ 2005-04-06 16:44 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list


On Apr 6, 2005, at 1:53 AM, Benjamin Herrenschmidt wrote:

> On Wed, 2005-04-06 at 01:51 -0500, Kumar Gala wrote:
>  > Paul,
>  >
> > I've tracked down a bug I've been having to the fact that pte_update
> > assumes a pte is a unsigned long.=A0 I need to look into what the =
exact
>  > implications this has.=A0 I was wondering what the thoughts were =
with
> > respect to how this is suppose to work properly on 440 with its=20
> 64-bit
> > pte?=A0 I'm looking at a 64-bit pte for some Freescale book-e parts =
as=20
> we
> > move to 36-bit physical address support.
>  >
> > The problem I found was ptep_get_and_clear() would return back only =
a
> > 32-bit value and thus we loose any information in the upper=20
> 32-bits.=A0 I
> > found the call in sys_mprotect ... -> change_pte_range ->
> > ptep_get_and_clear()
> >
> > Will provide some update on this tomorrow.
>
> It's quite important for the flags to all be together in a single 32
>  bits entity so that atomic operations can be done on them. The RPN
>  should be able to extend beyond the initial 32 bits provided we are
>  careful about the way we manipulate the PTEs. When setting a PTE, we
>  should always first set the "other" part, then the PTE present bit=20
> last
> or a CPU would possibly get a stale PTE. The problem with that scheme=20=

> is
> that I can see possible races on dual page faults trying to fill in =
the
>  same PTE if we drop the page table lock (christoph lameter stuff) but=20=

> it
>  should work for us now.

Ben, I agree with you about having the flags in a single word so we can=20=

lock them properly.  In the short term it appears that the issue I'm=20
running into is explicit with ptep_get_and_clear():

static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned=20
long addr,
                                        pte_t *ptep)
{
         return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
}

It appears that we should be returning the pte that was passed in,=20
before its modified?  (seems a little silly to me, why bother, the=20
caller could do this -- i've posted to lkml on the issue?).  Anyways,=20
since pte_update only returns the lower 32-bits the wrong thing=20
happens.  The following seems to be a better implementation of=20
ptep_get_and_clear() for ppc32 which resolves my issue:

static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned=20
long addr,
                                        pte_t *ptep)
{
         pte_t tmp =3D *ptep;
         pte_update(ptep, ~_PAGE_HASHPTE, 0);
         return tmp;
}

If we are ok with this I'll send a patch upstream for it.  I'd like to=20=

still discuss how to make this all proper long term.  Currently,=20
ptep_get_and_clear was the only user of pte_update that used the return=20=

value for anything but flags.  One change would be for it to return=20
just the flags portion of the pte it was given.  Another would be for=20
us to implement a proper 64-bit pte version of pte_update.

- kumar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06 16:44   ` Kumar Gala
@ 2005-04-06 17:20     ` Chris Friesen
  2005-04-06 17:58       ` Kumar Gala
  2005-04-06 22:22     ` Benjamin Herrenschmidt
  2005-04-07 11:15     ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 18+ messages in thread
From: Chris Friesen @ 2005-04-06 17:20 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linux-ppc-embedded list, Paul Mackerras, linuxppc-dev list

Kumar Gala wrote:

> static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned 
> long addr,
>                                        pte_t *ptep)
> {
>         pte_t tmp = *ptep;
>         pte_update(ptep, ~_PAGE_HASHPTE, 0);
>         return tmp;
> }

Doesn't work.  The update and storing of the old value must be atomic. 
The value of the pte can change between storing the tmp value and 
calling pte_update().

Chris

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06 17:20     ` Chris Friesen
@ 2005-04-06 17:58       ` Kumar Gala
  2005-04-06 21:33         ` Kumar Gala
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Gala @ 2005-04-06 17:58 UTC (permalink / raw)
  To: Chris Friesen; +Cc: linux-ppc-embedded list, Paul Mackerras, linuxppc-dev list

On Apr 6, 2005, at 12:20 PM, Chris Friesen wrote:

> Kumar Gala wrote:
>
> > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, =
unsigned
>  > long addr,
>  >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_t *ptep)
>  > {
>  >=A0=A0=A0=A0=A0=A0=A0=A0 pte_t tmp =3D *ptep;
>  >=A0=A0=A0=A0=A0=A0=A0=A0 pte_update(ptep, ~_PAGE_HASHPTE, 0);
>  >=A0=A0=A0=A0=A0=A0=A0=A0 return tmp;
>  > }
>
> Doesn't work.=A0 The update and storing of the old value must be =
atomic.
> The value of the pte can change between storing the tmp value and
> calling pte_update().

Thanks, realized this after a bit more thought, and some discussion on=20=

lkml :)

Guess, I need to write a 64-bit pte version of pte_update, foo!

- kumar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06 17:58       ` Kumar Gala
@ 2005-04-06 21:33         ` Kumar Gala
  2005-04-08  8:26           ` Gabriel Paubert
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Gala @ 2005-04-06 21:33 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev list, linux-ppc-embedded list

Here is a version that works if CONFIG_PTE_64BIT is defined.  If we 
like this, I can simplify the pte_update so we dont need the (unsigned 
long)(p+1) - 4) trick anymore.  Let me know.

- kumar

#ifdef CONFIG_PTE_64BIT
static inline unsigned long long pte_update(pte_t *p, unsigned long clr,
                                        unsigned long set)
{
         unsigned long long old;
         unsigned long tmp;

         __asm__ __volatile__("\
1:      lwarx   %L0,0,%4\n\
         lwzx    %0,0,%3\n\
         andc    %1,%L0,%5\n\
         or      %1,%1,%6\n\
         stwcx.  %1,0,%4\n\
         bne-    1b"
         : "=&r" (old), "=&r" (tmp), "=m" (*p)
         : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), 
"m" (*p)
         : "cc" );
         return old;
}
#else
... code that exists today ...
#endif

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06 16:44   ` Kumar Gala
  2005-04-06 17:20     ` Chris Friesen
@ 2005-04-06 22:22     ` Benjamin Herrenschmidt
  2005-04-06 22:27       ` Kumar Gala
  2005-04-07 11:15     ` Benjamin Herrenschmidt
  2 siblings, 1 reply; 18+ messages in thread
From: Benjamin Herrenschmidt @ 2005-04-06 22:22 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list

On Wed, 2005-04-06 at 11:44 -0500, Kumar Gala wrote:

> Ben, I agree with you about having the flags in a single word so we can 
> lock them properly.  In the short term it appears that the issue I'm 
> running into is explicit with ptep_get_and_clear():
> 
> static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned 
> long addr,
>                                         pte_t *ptep)
> {
>          return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
> }
> 
> It appears that we should be returning the pte that was passed in, 
> before its modified?  (seems a little silly to me, why bother, the 
> caller could do this -- i've posted to lkml on the issue?).

No, we should return the "old" PTE.

>   Anyways, 
> since pte_update only returns the lower 32-bits the wrong thing 
> happens.  The following seems to be a better implementation of 
> ptep_get_and_clear() for ppc32 which resolves my issue:
> 
> static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned 
> long addr,
>                                         pte_t *ptep)
> {
>          pte_t tmp = *ptep;
>          pte_update(ptep, ~_PAGE_HASHPTE, 0);
>          return tmp;
> }

Hrm... I would still do it differently. I would read the "rpn only" part
non atomically and load/clear the other half atomically. Withotu that,
you may miss a bit set between the load and the update (for example,
_PAGE_HASHPTE may have been put in in between).

> If we are ok with this I'll send a patch upstream for it.  I'd like to 
> still discuss how to make this all proper long term.  Currently, 
> ptep_get_and_clear was the only user of pte_update that used the return 
> value for anything but flags.  One change would be for it to return 
> just the flags portion of the pte it was given.  Another would be for 
> us to implement a proper 64-bit pte version of pte_update.
> 
> - kumar
> 
-- 
Benjamin Herrenschmidt <benh@kernel.crashing.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06 22:22     ` Benjamin Herrenschmidt
@ 2005-04-06 22:27       ` Kumar Gala
  0 siblings, 0 replies; 18+ messages in thread
From: Kumar Gala @ 2005-04-06 22:27 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list


On Apr 6, 2005, at 5:22 PM, Benjamin Herrenschmidt wrote:

> On Wed, 2005-04-06 at 11:44 -0500, Kumar Gala wrote:
>
> > Ben, I agree with you about having the flags in a single word so we=20=

> can
> > lock them properly.=A0 In the short term it appears that the issue =
I'm
> > running into is explicit with ptep_get_and_clear():
> >
> > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, =
unsigned
>  > long addr,
>  >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_t *ptep)
>  > {
>  >=A0=A0=A0=A0=A0=A0=A0=A0=A0 return __pte(pte_update(ptep, =
~_PAGE_HASHPTE, 0));
>  > }
>  >
> > It appears that we should be returning the pte that was passed in,
>  > before its modified?=A0 (seems a little silly to me, why bother, =
the
> > caller could do this -- i've posted to lkml on the issue?).
>
> No, we should return the "old" PTE.
>
> >=A0=A0 Anyways,
> > since pte_update only returns the lower 32-bits the wrong thing
> > happens.=A0 The following seems to be a better implementation of
> > ptep_get_and_clear() for ppc32 which resolves my issue:
>  >
> > static inline pte_t ptep_get_and_clear(struct mm_struct *mm, =
unsigned
>  > long addr,
>  >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_t *ptep)
>  > {
>  >=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_t tmp =3D *ptep;
>  >=A0=A0=A0=A0=A0=A0=A0=A0=A0 pte_update(ptep, ~_PAGE_HASHPTE, 0);
>  >=A0=A0=A0=A0=A0=A0=A0=A0=A0 return tmp;
>  > }
>
> Hrm... I would still do it differently. I would read the "rpn only"=20
> part
>  non atomically and load/clear the other half atomically. Withotu =
that,
>  you may miss a bit set between the load and the update (for example,
>  _PAGE_HASHPTE may have been put in in between).

Ben,

I'll let you catch up with the rest of the thread, I realized most of=20
what you have here and created a new implementation of pte_update that=20=

should be doing what you are asking for.

- kumar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06 16:44   ` Kumar Gala
  2005-04-06 17:20     ` Chris Friesen
  2005-04-06 22:22     ` Benjamin Herrenschmidt
@ 2005-04-07 11:15     ` Benjamin Herrenschmidt
  2 siblings, 0 replies; 18+ messages in thread
From: Benjamin Herrenschmidt @ 2005-04-07 11:15 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list


> Ben, I agree with you about having the flags in a single word so we can 
> lock them properly.  In the short term it appears that the issue I'm 
> running into is explicit with ptep_get_and_clear():
> 
> static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned 
> long addr,
>                                         pte_t *ptep)
> {
>          return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
> }
> 
> It appears that we should be returning the pte that was passed in, 
> before its modified?  (seems a little silly to me, why bother, the 
> caller could do this 

No, we should return the PTE that was there before the change
atomically. It can't be done by the caller as if the caller was just
reading the PTE, there would be a race where the PTE could still change
between what you read and what you update. It has to be an atomic
operation.

> -- i've posted to lkml on the issue?).  Anyways, 
> since pte_update only returns the lower 32-bits the wrong thing 
> happens.  The following seems to be a better implementation of 
> ptep_get_and_clear() for ppc32 which resolves my issue:
> 
> static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned 
> long addr,
>                                         pte_t *ptep)
> {
>          pte_t tmp = *ptep;
>          pte_update(ptep, ~_PAGE_HASHPTE, 0);
>          return tmp;
> }
> 
> If we are ok with this I'll send a patch upstream for it.  I'd like to 
> still discuss how to make this all proper long term.  Currently, 
> ptep_get_and_clear was the only user of pte_update that used the return 
> value for anything but flags.  One change would be for it to return 
> just the flags portion of the pte it was given.  Another would be for 
> us to implement a proper 64-bit pte version of pte_update.
> 
> - kumar
> 
> _______________________________________________
> Linuxppc-embedded mailing list
> Linuxppc-embedded@ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc-embedded
-- 
Benjamin Herrenschmidt <benh@kernel.crashing.org>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-06 21:33         ` Kumar Gala
@ 2005-04-08  8:26           ` Gabriel Paubert
  2005-04-08 14:08             ` Kumar Gala
  0 siblings, 1 reply; 18+ messages in thread
From: Gabriel Paubert @ 2005-04-08  8:26 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list

On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote:
> Here is a version that works if CONFIG_PTE_64BIT is defined.  If we 
> like this, I can simplify the pte_update so we dont need the (unsigned 
> long)(p+1) - 4) trick anymore.  Let me know.
> 
> - kumar
> 
> #ifdef CONFIG_PTE_64BIT
> static inline unsigned long long pte_update(pte_t *p, unsigned long clr,
>                                        unsigned long set)
> {
>         unsigned long long old;
>         unsigned long tmp;
> 
>         __asm__ __volatile__("\
> 1:      lwarx   %L0,0,%4\n\
>         lwzx    %0,0,%3\n\
>         andc    %1,%L0,%5\n\
>         or      %1,%1,%6\n\
>         stwcx.  %1,0,%4\n\
>         bne-    1b"
>         : "=&r" (old), "=&r" (tmp), "=m" (*p)
>         : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), 
> "m" (*p)

Are you sure of your pointer arithmetic? I believe that
you'd rather want to use (unsigned char)(p)+4. Or even better:

:"r" (p), "b" (4), "r" (clr), "r" (set)

and change the first line to:  lwarx %L0,%4,%3.

Even more devious, you don't need the %4 parameter:

	li %L0,4
	lwarx %L0,%L0,%3

since %L0 cannot be r0. This saves one register.

>         : "cc" );

On PPC, I always prefer saying cr0 over cc. Maybe it's just
me, but it's the canonical register name in the architecture.

	Gabriel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-08  8:26           ` Gabriel Paubert
@ 2005-04-08 14:08             ` Kumar Gala
  2005-04-08 18:44               ` Gabriel Paubert
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Gala @ 2005-04-08 14:08 UTC (permalink / raw)
  To: Gabriel Paubert
  Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list


On Apr 8, 2005, at 3:26 AM, Gabriel Paubert wrote:

> On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote:
>  > Here is a version that works if CONFIG_PTE_64BIT is defined.=A0 If =
we
> > like this, I can simplify the pte_update so we dont need the=20
> (unsigned
> > long)(p+1) - 4) trick anymore.=A0 Let me know.
>  >
> > - kumar
>  >
> > #ifdef CONFIG_PTE_64BIT
> > static inline unsigned long long pte_update(pte_t *p, unsigned long=20=

> clr,
>  >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long set)
>  > {
>  >=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long long old;
>  >=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long tmp;
>  >
> >=A0=A0=A0=A0=A0=A0=A0=A0 __asm__ __volatile__("\
>  > 1:=A0=A0=A0=A0=A0 lwarx=A0=A0 %L0,0,%4\n\
>  >=A0=A0=A0=A0=A0=A0=A0=A0 lwzx=A0=A0=A0 %0,0,%3\n\
>  >=A0=A0=A0=A0=A0=A0=A0=A0 andc=A0=A0=A0 %1,%L0,%5\n\
> >=A0=A0=A0=A0=A0=A0=A0=A0 or=A0=A0=A0=A0=A0 %1,%1,%6\n\
>  >=A0=A0=A0=A0=A0=A0=A0=A0 stwcx.=A0 %1,0,%4\n\
>  >=A0=A0=A0=A0=A0=A0=A0=A0 bne-=A0=A0=A0 1b"
>  >=A0=A0=A0=A0=A0=A0=A0=A0 : "=3D&r" (old), "=3D&r" (tmp), "=3Dm" (*p)
> >=A0=A0=A0=A0=A0=A0=A0=A0 : "r" (p), "r" ((unsigned long)(p) + 4), "r" =
(clr), "r"=20
> (set),
> > "m" (*p)
>
> Are you sure of your pointer arithmetic? I believe that
>  you'd rather want to use (unsigned char)(p)+4. Or even better:

Realize that I'm converting the pointer to an int, so its not exactly=20
normal pointer math.  Was stick with the pre-existing stye.

>
> :"r" (p), "b" (4), "r" (clr), "r" (set)
>
> and change the first line to:=A0 lwarx %L0,%4,%3.
>
> Even more devious, you don't need the %4 parameter:
>
> =A0=A0=A0=A0=A0=A0=A0 li %L0,4
>  =A0=A0=A0=A0=A0=A0=A0 lwarx %L0,%L0,%3
>
> since %L0 cannot be r0. This saves one register.

Actually the compiler effective does this for me.  If you look at the=20
generated asm, the only additional instruction is an 'addi' and some=20
'mr' to handle getting things in the correct registers for the return. =20=

Not really sure if there is much else to do to optimize this.

>  >=A0=A0=A0=A0=A0=A0=A0=A0 : "cc" );
>
> On PPC, I always prefer saying cr0 over cc. Maybe it's just
>  me, but it's the canonical register name in the architecture.

Was sticking with the style of what already existed, but I agree that=20
cr is more natural to read than cc.

- kumar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-08 14:08             ` Kumar Gala
@ 2005-04-08 18:44               ` Gabriel Paubert
  2005-04-08 19:01                 ` Kumar Gala
  2005-04-09  0:32                 ` Paul Mackerras
  0 siblings, 2 replies; 18+ messages in thread
From: Gabriel Paubert @ 2005-04-08 18:44 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list

On Fri, Apr 08, 2005 at 09:08:28AM -0500, Kumar Gala wrote:
> 
> On Apr 8, 2005, at 3:26 AM, Gabriel Paubert wrote:
> 
> >On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote:
> > > Here is a version that works if CONFIG_PTE_64BIT is defined.  If we
> >> like this, I can simplify the pte_update so we dont need the 
> >(unsigned
> >> long)(p+1) - 4) trick anymore.  Let me know.
> > >
> >> - kumar
> > >
> >> #ifdef CONFIG_PTE_64BIT
> >> static inline unsigned long long pte_update(pte_t *p, unsigned long 
> >clr,
> > >                                        unsigned long set)
> > > {
> > >         unsigned long long old;
> > >         unsigned long tmp;
> > >
> >>         __asm__ __volatile__("\
> > > 1:      lwarx   %L0,0,%4\n\
> > >         lwzx    %0,0,%3\n\
> > >         andc    %1,%L0,%5\n\
> >>         or      %1,%1,%6\n\
> > >         stwcx.  %1,0,%4\n\
> > >         bne-    1b"
> > >         : "=&r" (old), "=&r" (tmp), "=m" (*p)
> >>         : "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" 
> >(set),
> >> "m" (*p)
> >
> >Are you sure of your pointer arithmetic? I believe that
> > you'd rather want to use (unsigned char)(p)+4. Or even better:
> 
> Realize that I'm converting the pointer to an int, so its not exactly 
> normal pointer math.  Was stick with the pre-existing stye.

Wow, my brain saw a "*" before the closing parenthesis. 
> 
> >
> >:"r" (p), "b" (4), "r" (clr), "r" (set)
> >
> >and change the first line to:  lwarx %L0,%4,%3.
> >
> >Even more devious, you don't need the %4 parameter:
> >
> >        li %L0,4
> >         lwarx %L0,%L0,%3
> >
> >since %L0 cannot be r0. This saves one register.
> 
> Actually the compiler effective does this for me.  If you look at the 
> generated asm, the only additional instruction is an 'addi' and some 
> 'mr' to handle getting things in the correct registers for the return.  
> Not really sure if there is much else to do to optimize this.

Now that I read it carefully, I realize that I was wrong. But there 
is still some room for optimization; the parameter that you don't 
need is %3: simply replace lwzx %0,0,%3 by lwz %0,-4(%4).

But I'm not sure that OOO cannot play tricks on you, what guarantees
that the lwz is done after lwarx?

	Regards,
	Gabriel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-08 18:44               ` Gabriel Paubert
@ 2005-04-08 19:01                 ` Kumar Gala
  2005-04-08 21:04                   ` Gabriel Paubert
  2005-04-09  0:32                 ` Paul Mackerras
  1 sibling, 1 reply; 18+ messages in thread
From: Kumar Gala @ 2005-04-08 19:01 UTC (permalink / raw)
  To: Gabriel Paubert
  Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list


On Apr 8, 2005, at 1:44 PM, Gabriel Paubert wrote:

> On Fri, Apr 08, 2005 at 09:08:28AM -0500, Kumar Gala wrote:
>  >
> > On Apr 8, 2005, at 3:26 AM, Gabriel Paubert wrote:
> >
> > >On Wed, Apr 06, 2005 at 04:33:14PM -0500, Kumar Gala wrote:
>  > > > Here is a version that works if CONFIG_PTE_64BIT is defined.=A0=20=

> If we
>  > >> like this, I can simplify the pte_update so we dont need the
> > >(unsigned
>  > >> long)(p+1) - 4) trick anymore.=A0 Let me know.
>  > > >
>  > >> - kumar
>  > > >
>  > >> #ifdef CONFIG_PTE_64BIT
> > >> static inline unsigned long long pte_update(pte_t *p, unsigned=20
> long
> > >clr,
>  > > >=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=
=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long set)
>  > > > {
>  > > >=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long long old;
>  > > >=A0=A0=A0=A0=A0=A0=A0=A0 unsigned long tmp;
>  > > >
>  > >>=A0=A0=A0=A0=A0=A0=A0=A0 __asm__ __volatile__("\
>  > > > 1:=A0=A0=A0=A0=A0 lwarx=A0=A0 %L0,0,%4\n\
>  > > >=A0=A0=A0=A0=A0=A0=A0=A0 lwzx=A0=A0=A0 %0,0,%3\n\
>  > > >=A0=A0=A0=A0=A0=A0=A0=A0 andc=A0=A0=A0 %1,%L0,%5\n\
> > >>=A0=A0=A0=A0=A0=A0=A0=A0 or=A0=A0=A0=A0=A0 %1,%1,%6\n\
>  > > >=A0=A0=A0=A0=A0=A0=A0=A0 stwcx.=A0 %1,0,%4\n\
>  > > >=A0=A0=A0=A0=A0=A0=A0=A0 bne-=A0=A0=A0 1b"
>  > > >=A0=A0=A0=A0=A0=A0=A0=A0 : "=3D&r" (old), "=3D&r" (tmp), "=3Dm" =
(*p)
>  > >>=A0=A0=A0=A0=A0=A0=A0=A0 : "r" (p), "r" ((unsigned long)(p) + 4), =
"r" (clr), "r"
> > >(set),
>  > >> "m" (*p)
>  > >
>  > >Are you sure of your pointer arithmetic? I believe that
>  > > you'd rather want to use (unsigned char)(p)+4. Or even better:
>  >
> > Realize that I'm converting the pointer to an int, so its not =
exactly
> > normal pointer math.=A0 Was stick with the pre-existing stye.
>
> Wow, my brain saw a "*" before the closing parenthesis.
> >
> > >
>  > >:"r" (p), "b" (4), "r" (clr), "r" (set)
>  > >
>  > >and change the first line to:=A0 lwarx %L0,%4,%3.
>  > >
>  > >Even more devious, you don't need the %4 parameter:
> > >
>  > >=A0=A0=A0=A0=A0=A0=A0 li %L0,4
>  > > =A0=A0=A0=A0=A0=A0=A0 lwarx %L0,%L0,%3
>  > >
>  > >since %L0 cannot be r0. This saves one register.
>  >
> > Actually the compiler effective does this for me.=A0 If you look at =
the
>  > generated asm, the only additional instruction is an 'addi' and =
some
> > 'mr' to handle getting things in the correct registers for the=20
> return.=A0
> > Not really sure if there is much else to do to optimize this.
>
> Now that I read it carefully, I realize that I was wrong. But there
> is still some room for optimization; the parameter that you don't
> need is %3: simply replace lwzx %0,0,%3 by lwz %0,-4(%4).

Doesn't help, realize that we are going to have "r3" with a pointer to=20=

pte.  There is no way w/o an add to get to the next word for the lwarx.

> But I'm not sure that OOO cannot play tricks on you, what guarantees
>  that the lwz is done after lwarx?

I'm assuming since its a single asm block, gcc is not allowed to=20
reorder it.

- kumar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-08 19:01                 ` Kumar Gala
@ 2005-04-08 21:04                   ` Gabriel Paubert
  2005-04-08 21:31                     ` Dan Malek
  2005-04-08 23:32                     ` Kumar Gala
  0 siblings, 2 replies; 18+ messages in thread
From: Gabriel Paubert @ 2005-04-08 21:04 UTC (permalink / raw)
  To: Kumar Gala; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list

On Fri, Apr 08, 2005 at 02:01:13PM -0500, Kumar Gala wrote:
> >Now that I read it carefully, I realize that I was wrong. But there
> >is still some room for optimization; the parameter that you don't
> >need is %3: simply replace lwzx %0,0,%3 by lwz %0,-4(%4).
> 
> Doesn't help, realize that we are going to have "r3" with a pointer to 
> pte.  There is no way w/o an add to get to the next word for the lwarx.

I'd have to see the context. One less parameter to an asm block may
also make the compiler life easier. 

> 
> >But I'm not sure that OOO cannot play tricks on you, what guarantees
> > that the lwz is done after lwarx?
> 
> I'm assuming since its a single asm block, gcc is not allowed to 
> reorder it.

Not GCC, but the hardware. If loads can pass loads and lwarx has
more internal housekeeping overhead (obviously) than lwz. Especially
in the case of a processor with 2 LSU:
- lwarx issued to LSU1
- lwz issued LSU2 in the same clock cycle

I'm not sure at all that that you are guaranteed not to get 
potentially stale data from the lwz on SMP. Loads are weekly 
ordered in general wrt each other and lwarx is no exception
AFAIR. The fact that the two words are guaranteed to be in 
the same cache line makes it extremely unlikely, but not 
impossible.

	Gabriel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-08 21:04                   ` Gabriel Paubert
@ 2005-04-08 21:31                     ` Dan Malek
  2005-04-08 21:44                       ` Gabriel Paubert
  2005-04-08 23:32                     ` Kumar Gala
  1 sibling, 1 reply; 18+ messages in thread
From: Dan Malek @ 2005-04-08 21:31 UTC (permalink / raw)
  To: Gabriel Paubert
  Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list


On Apr 8, 2005, at 5:04 PM, Gabriel Paubert wrote:

> ...... Loads are weekly
                               ^^^^^^
> ordered in general wrt each other and lwarx is no exception
> AFAIR.

Maybe that's my performance problem :-)


	-- Dan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-08 21:31                     ` Dan Malek
@ 2005-04-08 21:44                       ` Gabriel Paubert
  0 siblings, 0 replies; 18+ messages in thread
From: Gabriel Paubert @ 2005-04-08 21:44 UTC (permalink / raw)
  To: Dan Malek; +Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list

On Fri, Apr 08, 2005 at 05:31:43PM -0400, Dan Malek wrote:
> 
> On Apr 8, 2005, at 5:04 PM, Gabriel Paubert wrote:
> 
> >...... Loads are weekly
>                               ^^^^^^
> >ordered in general wrt each other and lwarx is no exception
> >AFAIR.
> 
> Maybe that's my performance problem :-)

Good catch ;-) 

I believe it's my funniest typo to date.

	Gabriel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-08 21:04                   ` Gabriel Paubert
  2005-04-08 21:31                     ` Dan Malek
@ 2005-04-08 23:32                     ` Kumar Gala
  1 sibling, 0 replies; 18+ messages in thread
From: Kumar Gala @ 2005-04-08 23:32 UTC (permalink / raw)
  To: Gabriel Paubert
  Cc: linuxppc-dev list, Paul Mackerras, linux-ppc-embedded list


On Apr 8, 2005, at 4:04 PM, Gabriel Paubert wrote:

> On Fri, Apr 08, 2005 at 02:01:13PM -0500, Kumar Gala wrote:
>  > >Now that I read it carefully, I realize that I was wrong. But =
there
>  > >is still some room for optimization; the parameter that you don't
>  > >need is %3: simply replace lwzx %0,0,%3 by lwz %0,-4(%4).
> >
> > Doesn't help, realize that we are going to have "r3" with a pointer=20=

> to
> > pte.=A0 There is no way w/o an add to get to the next word for the=20=

> lwarx.
>
> I'd have to see the context. One less parameter to an asm block may
>  also make the compiler life easier.

The only thing we could do is make the 4 a constant param and change=20
the lwarx to use it.. not sure if thats any better than what we are=20
doing.

>  >
> > >But I'm not sure that OOO cannot play tricks on you, what =
guarantees
>  > > that the lwz is done after lwarx?
>  >
> > I'm assuming since its a single asm block, gcc is not allowed to
> > reorder it.
>
> Not GCC, but the hardware. If loads can pass loads and lwarx has
>  more internal housekeeping overhead (obviously) than lwz. Especially
>  in the case of a processor with 2 LSU:
>  - lwarx issued to LSU1
>  - lwz issued LSU2 in the same clock cycle
>
> I'm not sure at all that that you are guaranteed not to get
> potentially stale data from the lwz on SMP. Loads are weekly
> ordered in general wrt each other and lwarx is no exception
>  AFAIR. The fact that the two words are guaranteed to be in
> the same cache line makes it extremely unlikely, but not
> impossible.

You are correct, I guess I really need an eieio in between the lwarx=20
and lwzx

- kumar=

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: pte_update and 64-bit PTEs on PPC32?
  2005-04-08 18:44               ` Gabriel Paubert
  2005-04-08 19:01                 ` Kumar Gala
@ 2005-04-09  0:32                 ` Paul Mackerras
  1 sibling, 0 replies; 18+ messages in thread
From: Paul Mackerras @ 2005-04-09  0:32 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: linuxppc-dev list, linux-ppc-embedded list

Gabriel Paubert writes:

> But I'm not sure that OOO cannot play tricks on you, what guarantees
> that the lwz is done after lwarx?

Nothing, but it doesn't matter, because we have the
mm->page_table_lock, and anything that is changing the top 32 bits, or
anything in the bottom 32 bits other than the _PAGE_HASHPTE bit,
must also take the mm->page_table_lock.  The low-level hash_page
routine can change the _PAGE_HASHPTE bit without having that lock,
which is why we need the atomic sequence.

Paul.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2005-04-09  0:32 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-04-06  6:51 pte_update and 64-bit PTEs on PPC32? Kumar Gala
2005-04-06  6:53 ` Benjamin Herrenschmidt
2005-04-06 16:44   ` Kumar Gala
2005-04-06 17:20     ` Chris Friesen
2005-04-06 17:58       ` Kumar Gala
2005-04-06 21:33         ` Kumar Gala
2005-04-08  8:26           ` Gabriel Paubert
2005-04-08 14:08             ` Kumar Gala
2005-04-08 18:44               ` Gabriel Paubert
2005-04-08 19:01                 ` Kumar Gala
2005-04-08 21:04                   ` Gabriel Paubert
2005-04-08 21:31                     ` Dan Malek
2005-04-08 21:44                       ` Gabriel Paubert
2005-04-08 23:32                     ` Kumar Gala
2005-04-09  0:32                 ` Paul Mackerras
2005-04-06 22:22     ` Benjamin Herrenschmidt
2005-04-06 22:27       ` Kumar Gala
2005-04-07 11:15     ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).